HW 5

ISE 530 Optimization for Analytics Homework V: An AMPL exercise. Due 11:59 PM Wednesday October 21, 2020.

The crime dataset spreadsheet has 50 rows (one per data point) and 7 columns. The first two columns are the observed crime rates to be predicted from the features in the last 5 columns. Formulate a least-squares regression problem with a regularizer term as the following quadratic program, with the columns of the matrix A being the five features from the dataset and y being the 50-dimensional vector of observed crime rates from the first column:

LASSO regression (\(\ell_1\)-regularized least squares) with \(\lambda = 1\):

(246)\[\begin{align} \underset{x\in\mathcal{R}^n}{\text{minimize }} &\frac{1}{2}(y-Ax)^\top (y-Ax) + \sum^n_{i=1} \vert x_i \vert \end{align}\]

So the second column of the dataset is not used. Solve the problem using AMPL. (The solution below uses CVXPY in place of AMPL; the formulation is identical.)

%load_ext autotime
%load_ext nb_black
%matplotlib inline

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

plt.rcParams["figure.dpi"] = 300
plt.rcParams["figure.figsize"] = (16, 12)

import pandas as pd
import numpy as np
import cvxpy as cp
import scipy as sp
from scipy import optimize
import functools

crime = pd.read_table(
    "./data/crime.txt",
    header=None,
    names=["y_1", "y_2", "x_1", "x_2", "x_3", "x_4", "x_5"],
)
crime.head()
   y_1  y_2  x_1  x_2  x_3  x_4  x_5
0  478  184   40   74   11   31   20
1  494  213   32   72   11   43   18
2  643  347   57   70   18   16   16
3  341  565   31   71   11   25   19
4  773  327   67   72    9   29   24
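
A quick shape check (added here as a sketch, not part of the assignment) against the stated 50 rows and 7 columns:

# Verify the stated dataset dimensions: 50 data points, 2 response
# columns and 5 feature columns.
assert crime.shape == (50, 7)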
(247)\[\begin{align} \underset{x\in\mathcal{R}^n}{\text{minimize }} &\frac{1}{2}(y-Ax)^\top (y-Ax) + \sum^n_{i=1} \vert x_i \vert \end{align}\]
# Least-squares loss: ½‖Y − Xβ‖²
loss_fn = lambda X, Y, β: 0.5 * (cp.norm2(Y - X @ β) ** 2)
# ℓ1 regularizer: ‖β‖₁
regularizer = lambda β: cp.norm1(β)
# Objective: loss plus λ times the ℓ1 penalty
objective_fn = lambda X, Y, β, λ: loss_fn(X, Y, β) + λ * regularizer(β)
# Mean squared error (defined for reference; not used below)
mse = lambda X, Y, β: (1.0 / X.shape[0]) * loss_fn(X, Y, β).value
features = ["x_1", "x_2", "x_3", "x_4", "x_5"]
response = ["y_1"]  # only the first crime-rate column is used
β = cp.Variable((len(features), 1))
λ = 1
y, A, x = crime[response].values, crime[features].values, β
prob = cp.Problem(cp.Minimize(objective_fn(X=A, Y=y, β=x, λ=λ)))
prob.solve()

print("Tikhonov-regularized LASSO Regression with λ=1 Solution")
print("=" * 40)
print(f"Status: {prob.status}")
print(f"The optimal value is: {np.round(prob.value, 2)}")
print(f"The optimal solution is: x = {[np.round(β_i, 2) for β_i in β.value.flatten()]}")
LASSO Regression with λ=1 Solution
========================================
Status: optimal_inaccurate
The optimal value is: 1445051.55
The optimal solution is: x = [11.2, -0.16, 13.8, 3.65, -1.45]
time: 19.6 ms
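
The status optimal_inaccurate means the default solver stopped before reaching its full accuracy tolerance. A minimal cross-check sketch, assuming the ECOS solver is installed (any other installed conic solver would do):

# Re-solve the same problem with an explicitly chosen solver to confirm
# the inaccurate solution; ECOS here is an assumption, not part of the
# original assignment.
prob.solve(solver=cp.ECOS)
print(f"Status: {prob.status}")
print(f"The optimal value is: {np.round(prob.value, 2)}")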

Reformulating LASSO Regression as a Quadratic Program

From:

(248)\[\begin{align} \underset{x\in\mathcal{R}^n}{\text{minimize }} &\frac{1}{2}(y-Ax)^\top (y-Ax) + \sum^n_{i=1} \vert x_i \vert \end{align}\]

Let’s set \(x = x^+ - x^-\) with \(x^{\pm} \geq 0\). Since the objective penalizes \(x^+_i + x^-_i\), at the optimum at most one of \(x^+_i, x^-_i\) is nonzero, so \(x^+ + x^- = \vert x \vert\):

(249)\[\begin{align} \underset{x^\pm \in \mathcal{R}^n}{\text{minimize }} &\frac{1}{2}(y-A(x^+ - x^-))^\top (y-A(x^+ - x^-)) + \sum^n_{i=1} (x^+_i + x^-_i) \\ \text{subject to } &x^\pm \geq 0 \end{align}\]
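
A quick numerical illustration of the split (hypothetical values, not from the dataset): the canonical choice \(x^+ = \max(x, 0)\), \(x^- = \max(-x, 0)\) recovers both \(x\) and \(\vert x \vert\).

# Hypothetical sanity check of the variable split: x = x+ - x- and
# |x| = x+ + x- hold elementwise for the canonical nonnegative parts.
x_demo = np.array([1.5, -2.0, 0.0])
x_plus, x_minus = np.maximum(x_demo, 0), np.maximum(-x_demo, 0)
assert np.allclose(x_plus - x_minus, x_demo)
assert np.allclose(x_plus + x_minus, np.abs(x_demo))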

Expanding:

(250)\[\begin{align} \underset{x^\pm \in \mathcal{R}^n}{\text{minimize }} &\frac{1}{2}\left(y^\top - (x^+ - x^-)^\top A^\top\right)\left(y - A(x^+ - x^-)\right) + \sum^n_{i=1} (x^+_i + x^-_i) \\ \text{subject to } &x^\pm \geq 0 \\ \newline \\ \underset{x^\pm \in \mathcal{R}^n}{\text{minimize }} &\frac{1}{2}\left(y^\top y - 2\, y^\top A(x^+ - x^-) + (x^+ - x^-)^\top (A^\top A) (x^+ - x^-)\right) + \sum^n_{i=1} (x^+_i + x^-_i) \\ \text{subject to } &x^\pm \geq 0 \\ \newline \\ \underset{x^\pm \in \mathcal{R}^n}{\text{minimize }} &\frac{1}{2}(x^+ - x^-)^\top (A^\top A) (x^+ - x^-) - y^\top A(x^+ - x^-) + \sum^n_{i=1} (x^+_i + x^-_i) \\ \text{subject to } &x^\pm \geq 0 \\ \newline \\ \underset{x^\pm \in \mathcal{R}^n}{\text{minimize }} &\frac{1}{2}(x^+ - x^-)^\top (A^\top A) (x^+ - x^-) - y^\top A(x^+ - x^-) + \mathbb{1}^\top(x^+ + x^-) \\ \text{subject to } &x^\pm \geq 0 \end{align}\]

Note that the constant \(\frac{1}{2} y^\top y\) is dropped at the third step; this shifts the optimal value by a constant but does not change the minimizer, which is why the QP below reports a different (negative) optimal value than the first solve.
# Split variable: column 0 holds x+, column 1 holds x-.
x = cp.Variable((len(features), 2))
prob = cp.Problem(
    cp.Minimize(
        # ½ (x+ − x−)ᵀ AᵀA (x+ − x−)
        0.5 * cp.quad_form(x[:, 0] - x[:, 1], A.T @ A)
        # − yᵀA (x+ − x−)
        - (y.T @ A) @ (x[:, 0] - x[:, 1])
        # 𝟙ᵀ (x+ + x−), the ℓ1 penalty with λ = 1
        + np.ones(len(features)) @ (x[:, 0] + x[:, 1])
    ),
    constraints=[x >= 0],
)
prob.solve()

print("Tikhonov-regularized LASSO Regression Quadratic Program with λ=1 Solution")
print("=" * 40)
print(f"Status: {prob.status}")
print(f"The optimal value is: {np.round(prob.value, 2)}")
print(f"The optimal solution is: x = {[np.round(β_i, 2) for β_i in β.value.flatten()]}")
Tikhonov-regularized LASSO Regression Quadratic Program with λ=1 Solution
========================================
Status: optimal
The optimal value is: -13558345.15
The optimal solution is: x = [11.2, -0.16, 13.8, 3.65, -1.45]
time: 14.5 ms
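
The QP optimum looks very different from the first solve only because the constant \(\frac{1}{2} y^\top y\) was dropped in the derivation above. A minimal reconciliation sketch, assuming both problems were solved in the same session (values should agree up to solver tolerance, since the first solve was only optimal_inaccurate):

# Recover x = x+ - x- from the split variables and add the dropped
# constant ½ yᵀy back to the QP optimal value; both should match the
# first formulation's solution and objective.
x_qp = x.value[:, 0] - x.value[:, 1]
print("x from the QP:         ", np.round(x_qp, 2))
print("x from the first solve:", np.round(β.value.flatten(), 2))
print("QP value + ½ yᵀy:      ", np.round(prob.value + 0.5 * np.sum(y**2), 2))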