Experimental design, discrete choice tools and interfaces to Bayesian statistical software - An illustration.
Discrete Choice Experiments, a.k.a. choice-based conjoint
Statistical experimental design (within a web survey)
Decision theory - Discrete choice demand theory
Econometrics - Discrete choice models - MNL, Mixed logit, …, in a panel data setting
Observational data - results from consumer behavior in the marketplace
Prices result from market equilibrium - price is typically endogenous
Explanatory variables have little variability in the market.
Explanatory variables are collinear in the market.
Abundant data (in IT systems).
Complex statistical models
Stated preferences - survey data
Price variation is constructed by analyst - price is exogenous by construction.
Can calculate demand for new products or products with new features
Products for which there is no market.
Relatively inexpensive data
Relatively simple models (hopefully)
Relatively complex survey design
8 attributes, each with 2 or 3 levels.
1 attribute (fees) with 6 levels.
Attribute | Levels |
---|---|
Ranking | Financial Times top 20; Financial Times top 100; Not ranked |
Generic/specific | Generic (Economics/Management); MSc Finance; MSc Marketing |
Duration | 1 year equivalent; 2 years equivalent |
Full/Part-time | Full-time; Part-time evenings; Part-time weekends |
International accreditation | Yes; No |
Internship | Yes; No |
Merit scholarship | Available; Not available |
Job prospects | 100 % employed upon completion; 80 % employed after 3 months; < 80 % employed after 6 months |
Tuition fee (EUR) | 5 000; 9 000; 13 000; 17 000; 21 000; 25 000 |
A total of \(3\times3\times2\times3\times2\times2\times2\times3\times6=7776\) distinct possibilities
If we combine them into groups of 3 we have a total of \(\binom{7776}{3}=78{,}333{,}933{,}600\) choice sets
There were only 24 questions !!!
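The two counts can be reproduced in a few lines of Python. This is a minimal sketch: the dictionary keys are illustrative labels for the nine attributes (eight programme attributes plus the tuition fee), not variable names from the survey.

```python
from math import comb, prod

# Number of levels per attribute (illustrative labels, fee last)
levels = {
    "ranking": 3,
    "generic_specific": 3,
    "duration": 2,
    "full_part_time": 3,
    "accreditation": 2,
    "internship": 2,
    "merit_scholarship": 2,
    "job_prospects": 3,
    "tuition_fee": 6,
}

n_profiles = prod(levels.values())    # size of the full factorial
n_choice_sets = comb(n_profiles, 3)   # unordered sets of 3 distinct profiles

print(n_profiles)
print(n_choice_sets)
```

With only 24 questions per respondent, the design has to pick a very small subset of these sets.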
Many options
Here - done by hand !!! (using Kuhfeld’s / Sloane’s orthogonal arrays libraries)
In Stata
Other specific software for DCE, e.g. Ngene
Other generic software for statistical experiments
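For comparison only, a naive baseline is to enumerate the full factorial and draw choice sets at random. The sketch below is not the orthogonal-array construction used here, and a random design carries no orthogonality or efficiency guarantee. It assumes 24 choice sets of 3 profiles each, and the seed is arbitrary.

```python
import itertools
import random

# Levels per attribute (eight programme attributes, then the fee)
levels = [3, 3, 2, 3, 2, 2, 2, 3, 6]

# Full factorial: every profile is a tuple of level indices
profiles = list(itertools.product(*[range(k) for k in levels]))

# Naive random design: 24 choice sets, each with 3 distinct profiles
rng = random.Random(2023)  # arbitrary seed, for reproducibility only
design = [rng.sample(profiles, 3) for _ in range(24)]

for q, choice_set in enumerate(design[:2], start=1):
    print(q, choice_set)
```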
Consumer \(n\) has utility for product \(i\) given by \(U_{ni}\)
\[U_{ni} =U(x_i,p_i,v_n)\]
where \(x_i\) are product characteristics, \(p_i\) is the price, \(v_n\) are parameters that characterize consumer preferences.
\[U_{ni} = - p_i\alpha_n +x_i\beta_n +\varepsilon_{ni}\]
Here
\[v_n=(\alpha_n,\beta_n,\varepsilon_{ni})\]
In the simplest case \(\alpha_n=\alpha\), \(\beta_n=\beta\) and \(\varepsilon_{ni}\) is i.i.d. with a type I (Gumbel) extreme value distribution.
The consumer chooses the product with maximum utility. The probability of choosing product \(i\) is:
\[s_i=\frac{\exp(V_i)}{\sum_k \exp(V_k)}\]
where
\[V_i=- p_i\alpha +x_i\beta\]
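The choice probabilities can be evaluated numerically as a softmax over the \(V_i\). A minimal NumPy sketch follows; the prices, characteristics and coefficients are made-up numbers for illustration, not estimates.

```python
import numpy as np

def mnl_shares(p, x, alpha, beta):
    """Return s_i = exp(V_i) / sum_k exp(V_k) with V_i = -p_i*alpha + x_i @ beta."""
    v = -np.asarray(p) * alpha + np.asarray(x) @ np.asarray(beta)
    v = v - v.max()            # shift for numerical stability; shares are unchanged
    ev = np.exp(v)
    return ev / ev.sum()

# Hypothetical example: three products, two characteristics
p = np.array([5.0, 9.0, 13.0])
x = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
shares = mnl_shares(p, x, alpha=0.1, beta=np.array([0.5, 0.3]))
print(shares, shares.sum())
```

Subtracting the maximum of \(V\) before exponentiating avoids overflow and leaves the ratios untouched.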
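* Attribute-level dummies plus the opt-out indicator
global xvars "out_opt datt1_top100 datt1_nrank datt2_fin datt2_mark datt3_2y datt4_ptev datt4_ptwe datt5_noia datt6_noint datt7_noms datt8_emp80 datt8_empl80"
* Conditional (multinomial) logit; group(gid) ties together the alternatives of one choice set
clogit choice_m $xvars fees, group(gid)
est sto mnl_2018_2023
* Build a single nlcom call with WTP = -b[attribute]/b[fees] for every attribute
local cmd "nlcom"
foreach v in $xvars {
    local cmd "`cmd' (`v':-_b[`v']/_b[fees])"
}
* Run it; post makes the WTP estimates the active results so they can be stored
`cmd', post
est sto wtp_2018_2023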
62 students in total.
----------------------------------------
Variable | mnl wtp
-------------+--------------------------
choice_m |
out_opt | -2.751*** -72.834***
datt1_top100 | -0.342*** -9.063**
datt1_nrank | -1.072*** -28.385***
datt2_fin | -0.709*** -18.763***
datt2_mark | -1.238*** -32.761***
datt3_2y | -0.433*** -11.451***
datt4_ptev | 0.092 2.441
datt4_ptwe | 0.081 2.136
datt5_noia | -0.301*** -7.978***
datt6_noint | -0.320*** -8.483***
datt7_noms | -0.112 -2.954
datt8_emp80 | -0.873*** -23.100***
datt8_empl80 | -1.237*** -32.754***
fees | -0.038***
-------------+--------------------------
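The wtp column is minus the ratio of each attribute coefficient to the fees coefficient, as in the nlcom expression. The sketch below redoes that arithmetic in Python from the rounded coefficients printed in the table. Because the fees coefficient is shown with only three decimals, the results approximate the reported values (about -72.4 against -72.834 for out_opt). No standard errors are computed.

```python
# Rounded MNL coefficients copied from the table
coef = {
    "out_opt": -2.751,
    "datt1_top100": -0.342,
    "datt1_nrank": -1.072,
    "datt2_fin": -0.709,
    "datt2_mark": -1.238,
    "datt3_2y": -0.433,
    "datt4_ptev": 0.092,
    "datt4_ptwe": 0.081,
    "datt5_noia": -0.301,
    "datt6_noint": -0.320,
    "datt7_noms": -0.112,
    "datt8_emp80": -0.873,
    "datt8_empl80": -1.237,
}
b_fees = -0.038

# WTP = -b_attribute / b_fees, expressed in the units of the fees variable
wtp = {name: -b / b_fees for name, b in coef.items()}

for name, value in wtp.items():
    print(f"{name:14s} {value:9.2f}")
```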
Use Stata's interface to Python to run models in Stan.
Only a few lines of code change.
model {
  array[NRES,NCHO] vector[NALT] V;
  // Utilities: one entry per respondent, choice set and alternative
  for (n in 1:NOBS){
    V[r_id[n],c_id[n],a_id[n]] = x[n]*b_n[r_id[n]];
  }
  // Log-likelihood: categorical logit over the alternatives of each choice set
  for (i in 1:NRES){
    for (j in 1:NCHO){
      target += categorical_logit_lpmf(y[i,j] | V[i,j]);
    }
  }
  // Priors
  b_m ~ normal(0,10.0);  // hyperprior
  b_s ~ cauchy(0,2.5);   // hyperprior
  for (i in 1:NRES){
    eta_i[i] ~ normal(0,1);
  }
}
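To see what the model block computes, here is a NumPy sketch of the same log-likelihood. It assumes the individual coefficients are built in the non-centered form b_n = b_m + b_s * eta_i, which is not shown in the fragment above. The dimension name NVAR and the synthetic data are illustrative, and y is 0-based here, whereas Stan indexes categories from 1.

```python
import numpy as np

def log_softmax(v):
    """Log of the softmax over the last axis."""
    v = v - v.max(axis=-1, keepdims=True)
    return v - np.log(np.exp(v).sum(axis=-1, keepdims=True))

def panel_loglik(x, y, b_m, b_s, eta_i):
    """x: (NRES, NCHO, NALT, NVAR); y: (NRES, NCHO), 0-based chosen alternative."""
    b_n = b_m + b_s * eta_i                    # assumed non-centered form, (NRES, NVAR)
    V = np.einsum("rcak,rk->rca", x, b_n)      # utilities, (NRES, NCHO, NALT)
    logp = log_softmax(V)                      # log choice probabilities
    chosen = np.take_along_axis(logp, y[..., None], axis=-1)
    return float(chosen.sum())

# Synthetic data, for illustration only
rng = np.random.default_rng(0)
NRES, NCHO, NALT, NVAR = 4, 3, 3, 2
x = rng.normal(size=(NRES, NCHO, NALT, NVAR))
y = rng.integers(0, NALT, size=(NRES, NCHO))

# With all coefficients at zero every alternative has probability 1/NALT
ll_zero = panel_loglik(x, y, np.zeros(NVAR), np.ones(NVAR), np.zeros((NRES, NVAR)))
ll_rand = panel_loglik(x, y, np.zeros(NVAR), np.ones(NVAR), rng.normal(size=(NRES, NVAR)))
print(ll_zero, ll_rand)
```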
python
import numpy as np
from sfi import Data
# Copy the Stata variables in memory into NumPy arrays
xvars=["out_opt","datt1_top100","datt1_nrank","datt2_fin","datt2_mark","datt3_2y","datt4_ptev","datt4_ptwe","datt5_noia","datt6_noint","datt7_noms","datt8_emp80","datt8_empl80","fees"]
y=np.asarray(Data.get("choice_m"))
x=np.asarray(Data.get(xvars))
ids=np.asarray(Data.get(["idy","chset"]))
alt=np.asarray(Data.get("alt"))
end
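One possible next step is to turn these long-format arrays into the data dictionary the Stan program expects. The sketch below is an assumption about that step, not the code actually used: make_stan_data and NVAR are hypothetical names. It assumes choice_m is a 0/1 indicator, a balanced panel with the same choice-set labels for every respondent, and exactly one chosen alternative per choice set. Synthetic arrays stand in for the Stata data so the block runs outside Stata.

```python
import numpy as np

def make_stan_data(choice, x, ids, alt):
    """Long format (one row per respondent x choice set x alternative) to a Stan data dict."""
    _, r_id = np.unique(ids[:, 0], return_inverse=True)   # respondent index, 0-based
    _, c_id = np.unique(ids[:, 1], return_inverse=True)   # choice-set index, 0-based
    _, a_id = np.unique(alt, return_inverse=True)         # alternative index, 0-based
    NRES, NCHO, NALT = int(r_id.max()) + 1, int(c_id.max()) + 1, int(a_id.max()) + 1
    # y[i, j] = 1-based index of the alternative chosen by respondent i in choice set j
    y = np.zeros((NRES, NCHO), dtype=int)
    picked = choice == 1
    y[r_id[picked], c_id[picked]] = a_id[picked] + 1
    return {
        "NOBS": len(choice), "NRES": NRES, "NCHO": NCHO, "NALT": NALT,
        "NVAR": x.shape[1],                                # hypothetical name
        "r_id": r_id + 1, "c_id": c_id + 1, "a_id": a_id + 1,  # Stan is 1-based
        "x": x, "y": y,
    }

# Synthetic stand-in: 2 respondents, 2 choice sets, 3 alternatives, 2 regressors
ids = np.array([[r, c] for r in (101, 102) for c in (1, 2) for _ in range(3)])
alt = np.tile([1, 2, 3], 4)
choice = np.array([1, 0, 0,  0, 0, 1,  0, 1, 0,  1, 0, 0])
x = np.arange(24, dtype=float).reshape(12, 2)

data = make_stan_data(choice, x, ids, alt)
print(data["y"])
```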
Thank you !