By: Ram on Jun 18, 2020
When do you use Multiple Regression?
Based on the scale of measurement, variables can be defined as Binary, Ordinal, Nominal, and Continuous (Ratio and Interval Scale) type. When a decision (or target/dependent) variable is continuous, one of the Statistical Methods available for building the model is multiple regression. These types of scenarios or problems are classified as Regression problems.
Some of the scenarios and ideas are listed below. These examples span functional areas and business verticals.
Multiple Regression Applications across Industries.
| Industry Vertical | Scenario | Scenario Description |
|---|---|---|
| Human Resources | Salary Estimate | Predicting or estimating a person's salary based on a set of attributes such as years of experience, level of education, industry of work, previous job salary, etc. |
| Human Resources | Months of Stickiness | Given a high level of employee churn, a multiple-regression-based model estimates months of stickiness (time spent in the job with a new employer) at the time of recruitment, based on candidate attributes. |
| Human Resources | Resource Demand | Causal forecasting to estimate demand for each technical skill. The bench level at most big IT services providers is important for winning and delivering projects, but it also adds to cost. An accurate estimate of demand by skill could be an important measure for managing requirements at the right cost. |
| Real Estate | House Price Prediction | Predicting house prices from house, locality, and builder characteristics. |
| Real Estate | House Demand Forecast | Developing a forecasting model for the volume of houses on sale in a month, given economic factors, seasonality, and other dimensions. |
| Banking/Financial Services | Customer Value Estimation | Estimating customer value from customer-level attributes. |
| Banking/Financial Services | Spend Value at a Customer | Spend on a credit card is a strong indicator of customer engagement with the card and of whether it is the front-of-wallet card. Predicting cardholders' spend value could help product and marketing teams engage customers with an appropriate treatment strategy. |
| Banking/Financial Services | Balance Inflow into Transaction or Savings Account | Predicting the balance expected to be deposited into customers' transaction and savings accounts using customer-level characteristics. |
| Banking/Financial Services | Drivers of Account Open Volume | Building a marketing or media mix model to find the economic, advertising spend (across media or channels), competitor, and offer-related variables that affect new account open volume in a week. |
| Banking/Financial Services | Portfolio Loss Forecasting | In portfolio risk estimation, the Loan Over Line Equivalent concept is estimated within a multiple regression framework, with account variables (such as account line and outstanding balance at observation) and economic factors as independent variables. Reference: https://www.philadelphiafed.org//media/researchanddata/publications/workingpapers/2014/wp1410.pdf |
| Insurance/Financial Services | Claim Amount Estimation | Insurance providers charge a premium based on the estimated claim amount for the target group of customers, along with other factors. The claim could be against a motor, home, or pet policy. The estimated claim amount could also be used for operational cash reserve calculations. Reference: https://www.casact.org/pubs/proceed/proceed87/87354.pdf |
| Healthcare/Insurance | Healthcare Cost | Estimating an individual's healthcare cost to the healthcare insurer using previous claim history, demographics, and other data available about the individual. |
| Retailer/CPG | Sales Volume and Return on Investment Modeling | Finding the drivers of retail product sales as a function of spend across media channels, economic factors, and competitor actions. |
| Bank | Revenue Regression Model | Predicting customer revenue and identifying parameters linked to increased customer revenue. This helps business bankers realign their priorities and focus. |
Overall Approach of Regression Model Development
Multiple Regression Algorithm: Concepts
A Multiple Regression problem is formulated as follows:
Y = B0 + B1 * X1 + B2 * X2 + B3 * X3 + … + Error
Y is the target or dependent variable.
X1, X2, X3, etc. are the independent variables or features.
B0 is the intercept, and B1, B2, B3, etc. are the coefficients of the independent variables.
The main aim of the model is to find the values of these parameter estimates. The method used for estimating parameter values is Ordinary Least Squares (OLS). It finds the parameter values such that the overall error of the model, measured as the sum of squared differences between actual and predicted Y, is minimized.
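To make the OLS idea concrete, here is a minimal sketch in plain Python. It assumes the textbook route of solving the normal equations (X'X) B = X'y, which is one way to get the least-squares estimates. The helper names `solve_linear_system` and `fit_ols` and the small data set are made up for illustration; statistical tools use more robust numerical routines.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] on copies so the inputs stay unchanged.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [B0, B1, ..., Bk] minimizing the sum of squared errors."""
    # Prepend a constant 1 to every row so that B0 acts as the intercept.
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X'X) B = X'y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2 (no error term).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefficients = fit_ols(rows, y)
print([round(b, 6) for b in coefficients])
```

Because the sample data has no error term, the fit should recover B0, B1, and B2 close to 1, 2, and 3. With real data the estimates will not reproduce Y exactly, and the leftover differences are the residuals that OLS keeps as small as possible in the squared sense.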
The simplest special case of Multiple Regression is Simple Linear Regression, in which only one independent variable is considered, and the form is
Y = B0 + B1 * X1 + Error
Next, we walk through the steps for finding the values of the intercept (B0) and B1, the coefficient of the X1 variable.
Parameter Estimation in Simple Regression
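For simple regression, the OLS estimates have a well-known closed form: B1 is the sum of (X1 - mean of X1) * (Y - mean of Y) divided by the sum of (X1 - mean of X1) squared, and B0 is mean of Y minus B1 times mean of X1. The sketch below applies these two formulas step by step. The function name `fit_simple_regression` and the five data points are assumptions made for illustration only.

```python
def fit_simple_regression(x, y):
    """Return (B0, B1) for Y = B0 + B1 * X1 using the closed-form OLS formulas."""
    n = len(x)
    # Step 1: means of X1 and Y.
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Step 2: sum of cross-deviations and sum of squared deviations of X1.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    # Step 3: slope, then intercept.
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


# Hypothetical data points, chosen so the arithmetic is easy to follow by hand.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_simple_regression(x, y)
print(round(b0, 4), round(b1, 4))
```

Worked by hand for this data: the means are 3 and 4, the cross-deviation sum is 6, and the squared-deviation sum is 10, so B1 = 6 / 10 = 0.6 and B0 = 4 - 0.6 * 3 = 2.2.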
Review Multiple Regression Output
Most analytical tools (such as Python, SAS, R, and SPSS) give similar output for a regression model.
A regression model's output typically has three parts.
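One of the key performance statistics for a regression model is R2, which indicates the percentage of variance in the target variable explained by the model. However, R2 never decreases when more variables are added, even irrelevant ones, so Adjusted R2, which penalizes additional variables, is evaluated to avoid selecting a needlessly complicated model.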
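As a rough illustration of the two statistics, the sketch below uses the standard definitions R2 = 1 - SSE / SST and Adj R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1), where n is the number of observations and k the number of independent variables. The function names, the data, and the fitted line 2.2 + 0.6 * X1 are assumptions chosen for the example.

```python
def r_squared(y, y_hat):
    """Share of the variance in y explained by the predictions y_hat."""
    y_mean = sum(y) / len(y)
    # SSE: variation left unexplained by the model.
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    # SST: total variation of y around its mean.
    sst = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - sse / sst


def adjusted_r_squared(r2, n, k):
    """Penalize R2 for using k independent variables on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical data with predictions from an assumed fitted line 2.2 + 0.6 * X1.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]

r2 = r_squared(y, y_hat)
adj_r2 = adjusted_r_squared(r2, n=len(y), k=1)
print(round(r2, 4), round(adj_r2, 4))
```

Here SSE is 2.4 and SST is 6, so R2 is 0.6, while Adj R2 comes out lower (about 0.47) because the penalty is large when there are only five observations.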
The main objective of modeling is to find parameter estimates. Variable significance is evaluated using the t-statistic and the p-value. In a regression model, the null hypothesis for each variable is "the Beta coefficient (parameter estimate) for the variable is zero". The p-value is the probability of observing a t-statistic at least as extreme as the one obtained if the null hypothesis were true, so a small p-value is evidence against the null hypothesis. A lower p-value therefore indicates that the variable can be kept in the model.
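The sketch below shows one way the t-statistic and p-value for a slope could be computed in the simple regression case, using only the standard library. It assumes the usual formulas: standard error of B1 = sqrt((SSE / (n - 2)) / Sxx), t = B1 / standard error, and a two-sided p-value from the t distribution with n - 2 degrees of freedom. The p-value is obtained by numerically integrating the t density with Simpson's rule, which is an illustrative shortcut; statistical packages use dedicated distribution functions. All names and data are hypothetical.

```python
import math


def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), using Simpson's rule on the interval [0, |t_stat|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return max(0.0, 1 - 2 * central_area)


def slope_t_test(x, y):
    """Return (t-statistic, two-sided p-value) for H0: B1 = 0."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    b0 = y_mean - b1 * x_mean
    # Residual sum of squares and the standard error of the slope.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt((sse / (n - 2)) / sxx)
    t_stat = b1 / std_err
    return t_stat, two_sided_p_value(t_stat, df=n - 2)


# Hypothetical data; five points is far too few for a real analysis.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t_stat, p_value = slope_t_test(x, y)
print(round(t_stat, 3), round(p_value, 3))
```

With these five points the t-statistic is about 2.12 and the p-value is about 0.12, so at the common 0.05 threshold there is not enough evidence to reject "B1 is zero", even though the estimated slope is positive. Note that `math.gamma` overflows for very large degrees of freedom, so this sketch only suits small samples.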
Model Selection