This notes summarize the concept and meaning of Robinson’s constraint qualification (RCQ) in convex optimization. Briefly speaking, RCQ is constructed from the first order Taylor approximation of the feasible set (reformed as a correspondence). A notable result of RCQ is that it is a necessary and sufficent condition for the Lagrange mltiplier set to be proper.

Reference: NUS MA6253 Conic Programming Lecture Notes of Sun Defeng.


1. Metric Projection

Def. (Metric Projection) For any set $K$ that is closed and convex in $H$ and for any $z\in H$, let

\[\Pi_K(z) = \arg\min_{y\in K} \|y-z\|\]

This optimization problem has a unique solution.

Note. For all $d\in K$ we have $\langle z - \Pi_K(z),d-\Pi_K(z)\rangle \le 0$.

Proof.

For any $d\in K$, letting $y_t=(1-t)\Pi_K(z)+td$ for any $t\in[0,1]$, then

\[\|z-y_t\|^2\ge\|z-\Pi_K(z)\|^2\]

since $y_t\in K$ by the convexity of $K$. Thus,

\[\|(1-t)(z-\Pi_K(z))+t(z-d)\|^2 \ge \|z-\Pi_K(z)\|\] \[\Rightarrow (t^2-2t)\|z-\Pi_K(z)\|^2+(1-t)t\langle z-\Pi_K(z),z-d \rangle+t^2\|z-d\|^2 \ge 0\] \[\Rightarrow (t-2)\|z-\Pi_K(z)\|^2+(1-t)\langle z-\Pi_K(z),z-d \rangle+t\|z-d\|^2 \ge 0\]

Letting $t\to0$ and we have

\[- 2\langle z-\Pi_K(z),z-\Pi_K(z)\rangle + \langle z-\Pi_K(z),z-d \rangle \ge 0\] \[\Rightarrow \langle z-\Pi_K(z), \Pi_K(z)-d\rangle \ge \|z-\Pi_K(z)\| \ge 0\]

Note. If $K$ is a cone, then for any $d\in K$ we have

\[\begin{aligned} &\langle z-\Pi_K(z),\Pi_K(z)\rangle = 0\\ &\langle z-\Pi_K(z),d\rangle\le 0 \end{aligned}\]

The first equality is by $0\in K$ and $2\Pi_K(z)\in K$. The second inequality is by $d+\Pi_K(z)\in K$.


2. Relative Interior

Def. (Relative Interior) For any convex set $\mathcal{C}\subset R^n$, the relative interior of $\mathcal{C}$ denoted by $\text{ri}(\mathcal{C})$ is defined by

\[\text{ri}(\mathcal{C}) = \{x\in \text{aff}(\mathcal{C}) \ |\ \exists \epsilon > 0\text{ such that }B_{\epsilon}(x)\cap \text{aff}(\mathcal{C}) \subset \mathcal{C} \}\]

where $\text{aff}(\mathcal{C})$ is the affine hull of $\mathcal{C}$.

Lemma 1. Let $\mathcal{C}$ be a convex set in $R^n$, let $x\in\text{ri}(\mathcal{C})$ and $y\in cl(\mathcal{C})$. Then $(1-\alpha)x + \alpha y \in \text{ri}(\mathcal{C})$ for all $\alpha\in[0,1)$.

Proof.

For any given $\alpha \in [0,1)$, denote by $m \equiv (1-\alpha)x + \alpha y$. We want to show that $B_\epsilon(m)\cap \text{aff}(\mathcal{C})\subset \mathcal{C}$ for some $\epsilon>0$. Since $x\in \text{ri}(\mathcal{C})$ we know there exists $\epsilon_x > 0$ such that $B_{\epsilon_x}(x)\cap\text{aff}(\mathcal{C}) \subset \mathcal{C}$. For all $x’ \in B_{\epsilon_x}(x)$ we have $(1-\alpha)x’ + \alpha y \in cl(\mathcal{C})$ by the convexity of $\mathcal{C}$, which implies $B_{(1-\alpha)\epsilon_x}(m) \subset cl(\mathcal{C})$. Since $\alpha < 1$ we have $(1-\alpha) \epsilon_x > 0$, and we simply pick $\epsilon = (1-\alpha) \epsilon_x / 2 > 0$ and $B_{\epsilon}(m) \subset \mathcal{C}$. Finally, $B_{\epsilon}(m) \cap \text{aff}(\mathcal{C}) \subset \mathcal{C}$. END.

Lemma 2. For any convex set $\mathcal{C}$ in $R^n$, we have $cl(\text{ri}(\mathcal{C})) = cl(\mathcal{C})$ and $\text{ri}(cl(\mathcal{C})) = \text{ri}(\mathcal{C})$.

Proof.

(1) All we need to show is that $cl(\mathcal{C}) \subset cl(\text{ri}(\mathcal{C}))$. For any $y\in cl(\mathcal{C})$, pick $x\in ri(\mathcal{C})$. By Lemma 1 we have

\[y_n \equiv (1-\alpha_n) x + \alpha_n y \in ri(\mathcal{C})\]

where $a_n \in (0,1]$ and $a_n \to 1$. It is obvious that $y_n \to y$. Thus, $y \in cl(\text{ri}(\mathcal{C}))$.

(2) All we need to show is that $\text{ri}(cl(\mathcal{C})) \subset \text{ri}(\mathcal{C})$. For any $y\in \text{ri}(cl(\mathcal{C}))$, we have $\epsilon_y > 0$ such that $B_{\epsilon_y}(y) \cap \text{aff}(cl(\mathcal{C})) \subset cl(\mathcal{C})$. Firstly, we want to show that $\text{aff}(cl(\mathcal{C})) = \text{aff}(\mathcal{C})$ by showing that $\text{aff}(cl(\mathcal{C})) \subset \text{aff}(\mathcal{C})$. for any $y \in \text{aff}(cl(\mathcal{C}))$ we can pick two points $x, z \in cl(\mathcal{C})$ such that $y = \alpha x + (1-\alpha)z$. Then we pick two sequences $(x_n)\in \mathcal{C}$ and $(z_n) \subset \mathcal{C}$ such that $x_n\to x$ and $z_n\to z$, and define $y_n\equiv \alpha x_n + (1-\alpha) z_n \in \text{aff}(\mathcal{C})$. Then $y_n \to y$. By the completeness of $\text{aff}(\mathcal{C})$ we have $y\in \text{aff}(\mathcal{C})$. Secondly, by selecting $\epsilon= \epsilon_y / 2$ and we have $B_\epsilon(y) \subset \mathcal{C}$. Finally, we conclude that $B_\epsilon(y) \cap \text{aff}(\mathcal{C}) \subset \mathcal{C}$, and then $y\in \text{ri}(\mathcal{C})$.


3. Fréchet Differentiation

Def. (Fréchet Differentiation) A function $f:\mathcal{X}\to\mathcal{Y}$ is (Fréchet) differentiable at $x\in\mathcal{X}$ if there exists a linear operator $f’:\mathcal{X}\to\mathcal{Y}$ such that for any $h\to 0$ we have

\[f(x+h)-f(x)-f'(x)h = o(\|h\|)\]

Let $\nabla f(x)$ be the adjoint of $f’(x)$.

Lemma 3. (Regularity of Fréchet differential) If the Fréchet derivative of $h:\mathcal{X}\to\mathcal{Y}$ at $\bar{x}$ is onto, then there exists a $M>0$ such that for all $y\in\mathcal{Y}$ there exists a $d\in\mathcal{X}$ with $y = h’(\bar{x})d$ and $\Vert d \Vert \le M\Vert y \Vert$.

Proof.

The condition that $h$ is onto is equivalent to that $0 \in \text{int}(h’(\bar{x})\mathcal{X}) = \text{int}(\text{range}(h’(\bar{x})))$. By Proposition 2 of Perturbation of Correspondence we have $x\in \text{null}(h’(\bar{x}))$,

\[0 \in \text{int}(h'(\bar{x})(x+rB)),\quad\text{ for all } r\in[0,1]\]

Equivalently, there exists $\eta>0$ such that

\[B(\eta) \subset h'(\bar{x})(x+rB),\quad\text{ for all } r\in[0,1]\]

Thus, for all $y\in\mathcal{Y}$ there exists $v$ with $|v|\le 1$ and

\[\frac{y}{\|y\|}\eta = h'(\bar{x})(x + v) \Rightarrow y = h'(\bar{x})\left[\frac{(x+v)\|y\|}{\eta}\right]\equiv h'(\bar{x})(d_1)\]

Then it is easy to verify that there exists $M>0$ such that $\Vert d \Vert \le M\Vert y \Vert$.


4. liminf and limsup of Correspondence

Def. (liminf and limsup) The upper and lower limits of a set map (parameterized family) $A_t\subset\mathcal{Y}$ are

\[\limsup_{t\to t_0} A_t = \{y\in\mathcal{Y} : \exists t_k\to t_0, \exists y_k\in A_{t_k} s.t. y_k\to y\}\] \[\liminf_{t\to t_0} A_t = \{y\in\mathcal{Y} : \forall t_k\to t_0, \exists y_k\in A_{t_k} s.t. y_k\to y\}\]

Lemma 4. (Closeness of liminf and limsup) $\limsup_{t\to t_0} A_t$ and $\liminf_{t\to t_0} A_t$ are closed.

Proof.

Denote by $\bar{A_t}(t_0)\equiv\limsup_{t\to t_0} A_t$ and $\underline{A_{t}}(t_0)\equiv\liminf_{t\to t_0} A_t$.

(1) For any sequence $(a_n)\subset\bar{A_t}(t_0)$ with $a_n\to a$, since $a_n\in \bar{A_t}(t_0)$ we know $\exists (t^n_k)$ with $t^n_k\to t_0$ s.t. $\exists y^n_k\in A_{t^n_k}$ with $y^n_k\to a_n$. For ant fixed $n$, define

\[k_n = \inf\left\{k\in N : \|y^n_k-a_n\| < \frac{1}{n} \text{ and } \|t^n_k-t_0\| <\frac{1}{n}\right\}\]

Let $t_n = t^n_{k_n}$ and $y_n = y^n_{k_n} \in A_{t_n}$. We have $t_n\to t_0$ and

\[\|y_n - a\| \le \|y_n-a_n\| + \|a_n-a\| \to 0\]

Thus, $a\in\bar{A_t}(t_0)$ and $\bar{A_t}(t_0)$ is closed.

(2) For any sequence $(a_n)\subset \underline{A_{t}}(t_0)$ with $a_n\to a$ and for any sequence $t_k\to t_0$, since $a_n\in\underline{A_{t}}(t_0)$ we know $\exists y^n_k\in A_{t_k}$ with $y^n_k\to a_n$. Let $k_1 = 1$. For $n\ge2$, define a sequence $(k_n)$ by

\[k_n = \inf\left\{k > k_{n-1} : \|y^n_k-a_n\| < \frac{1}{n}\text{ and }\|y^n_{k'}-y^n_k\|<\frac{1}{n} \text{ for all } k'\ge k\right\}\]

It’s obviously that $(k_n)$ is increasing. Define $(y_k)$ in the following recursive way:

Step 1. Let $n=1$.

Step 2. For all $k_n \le k < k_{n+1}$, let $y_k = y^n_k$. Otherwise, go to step 3.

Step 3. let $n$ be $n+1$. Go to Step 2.

It is obviously that $y_k \in A_{t_k}$ for all $k$. Now we show that $y_k\to a$. For any $k_{n} \le k < k_{n+1}$ we have

\[\begin{aligned} \|y_k - a\| \le \|y_k - y_{k_{n}}\| + \|y_{k_{n}} -a_{n}\| + \|a_{n} - a\| \le \frac{1}{n} + \frac{1}{n} + \|a_{n} - a\| \to 0 \end{aligned}\]

Thus, we conclude that $a\in\underline{A_{t}}(t_0)$ and $\underline{A_{t}}(t_0)$ is closed.


5. Tangent Cones

Def. (Tangent Cones) For any closed set $D\subset \mathcal{Y}$ and a point $y\in D$ we define

(1) Radial cone

\[R_D(y) = \{d\in\mathcal{Y}\ :\ \exists t^*>0 \text{ s.t. } y+td\in D, \forall t\in [0,t^*]\ \}\]

(2) Inner tangent cone

\[T^i_D(y) \equiv \liminf_{t\downarrow 0}\frac{D-y}{t}\]

(3) Contingent (Bouligand) cone

\[T_D(y) \equiv \limsup_{t\downarrow 0}\frac{D-y}{t}\]

(4) Clarke tangent cone

\[T^c_D(y) = \liminf_{t\downarrow 0,\ y'\to y}\frac{D-y'}{t}\]

Lemma 5. (Sequential Representations of Tangent Cones) For inner tangent cone and Bouligand cone, we have the following sequential representations

\[T^i_D(y) \equiv \liminf_{t\downarrow 0}\frac{D-y}{t} = \{d\in \mathcal{Y} : \forall t_k\downarrow0, D_\mathcal{Y}(y+t_kd,D) = o(t_k)\}\] \[T_D(y) \equiv \limsup_{t\downarrow 0}\frac{D-y}{t} = \{d\in\mathcal{Y}:\exists t_k\downarrow0, D_\mathcal{Y}(y+t_kd,D)=o(t_k)\}\]

Proof.

We only show Bouligand tangent cone, the proof of the other one is similar.

”$\subset$”. By definition of “limsup” we know for all $d\in T_D(y)$, $\exists$ $(t_k)$ with $t_k\downarrow 0$ and $\exists d_k\in \frac{D-y}{t_k}$ such that $d_k\to d$, meaning that $y+t_kd_k\in D$ with $d_k\to d$. Thus, we have

\[D_\mathcal{Y}(y+t_kd, D) \le \|(y+t_k d)-(y+t_kd_k)\| = t_k\|d_k-d\|\]

Thus, we conclude that $D_\mathcal{Y}(y+t_kd, D) = o(t_k)$.

”$\supset$”. For all $d\in\mathcal{Y}$ such that $\exists (t_k)$ with $t_k\downarrow 0$ and $D_\mathcal{Y}(y+t_kd,D)=o(t_k)$, meaning that $\exists (x_k)\subset D$ with

\[\|y+t_kd-x_k\| = o(t_k)\]

Let $d_k = \frac{x_k-y}{t_k}\in\frac{D-y}{t_k}$ and we have

\[t_k\cdot \|d-d_k\| = o(t_k)\]

which implies $d_k\to d$. Thus, $d\in T_D(y)$.

Lemma 6. (Relations among Tangent Cones 1) From the definitions above we directly have

(1) $T^i_D(y)$, $T_D(y)$ and $T^c_D(y)$ are all closed, and

(2) $R_D(y)\subset T^i_D(y)\subset T_D(y)$.

Proof.

(1) By the property of “limsup” and “liminf” (see Lemma 4).

(2) It is obviously that $T^i_D(y)\subset T_D(y)$. Now we want to show that $R_D(y)\subset T^i_D(y)$. For any $d\in R_D(y)$, by definition we know there exists a $t^\star>0$ such that $y+td\in D$ for all $t\in [0,t^\star]$. For any $(t_k)$ with $t_k\downarrow 0$, there exists $K>0$ such that $t_k < t^\star$ when $k>K$. Then we have $y+t_kd\in D$, and $D_\mathcal{Y}(y+t_kd,D) = 0 = o(t_k)$. Thus, we conclude that $d\in T^i_D(y)$.

Lemma 7. If $y\not\in D$ then $ R_D(y) = \emptyset$. Same for $T^i_D(y)$ and $T_D(y)$ (directly from the sequential representation).

Lemma 8. (Relations among Tangent Cones 2) If $D\subset\mathcal{Y}$ is closed and convex and $y\in D$, then we have

(1) $R_D(y) = \cup_{t>0}(D-y)/t$.

(2) $T_D(y) = T^i_D(y) = cl(R_D(y))$.

Proof.

(1) The “$\subset$” direction is straightforward. Now we want to show the “$\supset$” direction. For any $d\in \cup_{t>0}(D-y)/t$ there exists $t^\star>0$ and $s\in D$ such that $d = (s-y)/t^\star$. Then for all $t\in[0,t^\star]$ we have

\[y + td = y + t\cdot\frac{s-y}{t^\star} = (1-t/t^\star) \cdot y + t/t^\star \cdot s \in D\]

which implies that $d\in R_D(y)$.

(2) First we have $R_D(y)\subset T^i_D(y)\subset T_D(y)$ by Lemma 6 and $T^i_D(y), T_D(y)$ are closed by Lemma 6. Thus we have

\[cl(R_D(y))\subset T^i_D(y)\subset T_D(y)\]

Conversely, for any $d\in T_D(y)$, by definition there exists $(t_k)\downarrow 0$ and $(y_k)\subset D$ such that $(y_k-y)/t_k\to d$. Since $(y_k-y)/t_k\in R_D(y)$ by the conclusion of Part (1) we have $d\in cl(R_D(y))$.

Def. (Normal Cone) For any closed $D\subset\mathcal{Y}$, the normal cone to $D$ at $y\in\mathcal{Y}$ is defined by

\[\mathcal{N}_D(y) = (T_D(y))^{\circ} = \{x\in\mathcal{Y} : \langle x,z \rangle \le 0\text{ for all } z\in T_D(y)\}\]

i.e., the polar cone of Bouligand cone of $D$ at $y$.

Lemma 9. $N_D(y)$ is closed since polar cone is always closed.

Lemma 10. $N_D(y) = \emptyset$ if $y\not\in D$ since polar cone of empty set is empty set. (There are different definitions of polar of empty set and here we follow Sun’s notes.)

Lemma 11. If $D$ is closed and convex and $y\in D$, remember we have $R_D(y) = \cup_{t>0}(D-y)/t$ and then

\[\begin{aligned} \mathcal{N}_D(y) &= (T_D(y))^\circ\\ &= (cl(R_D(y)))^\circ\\ &= (R_D(y))^\circ\\ &= \{u\in\mathcal{Y} : \langle u,d-y \rangle \le 0,\forall d\in D\} \end{aligned}\]


6. The Optimization Problem

Def. (Optimization Problem, OP) Suppose $f, g, h$ are $C^1$ (in Fréchet differentiable sense) and $\mathcal{C}$ is a closed and convex set. Then the optimization problem (OP) is defined by

\[\begin{aligned} \min_{x\in\mathcal{X}}\quad &f(x)\\ s.t.\quad &h(x) = \{0\}^m\\ &g(x) \in \mathcal{C} \end{aligned} \tag{OP}\]

Def. (Compact Optimization Problem, COP) As a generalization [by letting $G = (h,g)$ and $\mathcal{K}={0}^m\times\mathcal{C}$ ], the compact form of OP (COP) is defined as

\[\begin{aligned} \min_{x\in\mathcal{X}}\quad &f(x) \\ \text{s.t. }\quad &G(x)\in\mathcal{K} \end{aligned} \tag{COP}\]

where $f:\mathcal{X}\to R$ and $G:\mathcal{X}\to\mathcal{Y}$ are $C^1$ (in Fréchet differentiable sense) and $\mathcal{K}\subset\mathcal{Y}$ is a closed and convex set. Let

\[\mathcal{F} = G^{-1}(\mathcal{K})\]

be the feasible set and

\[L(x,\mu) = f(x)-\langle \mu,G(x)\rangle\]

be the Lagrangian, where $\mu\in\mathcal{Y}^*$.

Def. (Lagrange Multiplier and KKT Conditions) We say $\bar{\mu}$ is a Lagrange multiplier of COP at feasible point $\bar{x}$ if it satisfies the KKT conditions

\[\nabla_xL(\bar{x},\bar{\mu}) = 0\quad\text{ and }\quad 0\in\bar{\mu}+N_\mathcal{K}(G(\bar{x})) \tag{KKT}\]

Lemma 12. (KKT Conditions when $\mathcal{K}$ is a Cone) If $\mathcal{K}$ is a closed and convex cone, then the KKT condition above is equivalent to

\[\nabla_xL(\bar{x},\bar{\mu}) = 0, \quad G(\bar{x})\in\mathcal{K}, \quad \bar{\mu}\in\mathcal{K}^\star \quad\text{and}\quad \langle \bar{\mu},G(\bar{x})\rangle=0\]

Proof.

Since $\bar{x}$ is feasible we have $G(\bar{x})\in\mathcal{K}$. Thus, by Lemma 11 we have

\[\begin{aligned} \mathcal{N}_\mathcal{K}(G(\bar{x})) = \{d : \langle d,u-G(\bar{x}) \le 0 \rangle,\forall u\in\mathcal{K}\} = \{d : \langle d,u \rangle\le\langle d,G(\bar{x})\rangle,\forall u\in\mathcal{K}\} \end{aligned}\]

Thus, $-\bar{\mu}\in N_\mathcal{K}(G(\bar{x}))$ implies $\langle \bar{\mu},G(\bar{x}) \rangle \le \langle \bar{\mu},u \rangle$, $\forall u\in\mathcal{K}$. Since $\mathcal{K}$ is a close convex cone, we have $0\in\mathcal{K}$ and $2G(\bar{x})\in\mathcal{K}$, thus

\[\langle \bar{\mu},G(\bar{x}) \rangle \le \langle \bar{\mu},0 \rangle = 0 \quad\text{ and }\quad \langle \bar{\mu},G(\bar{x}) \rangle \le \langle \bar{\mu},2G(\bar{x}) \rangle = 2\langle \bar{\mu},G(\bar{x}) \rangle\Rightarrow \langle \bar{\mu},G(\bar{x}) \rangle\ge 0\]

Finally we conclude that $\langle \bar{\mu},G(\bar{x}) \rangle = 0$. Further more,

\[0 = \langle \bar{\mu},G(\bar{x}) \rangle \le \langle \bar{\mu},u \rangle,\quad\forall u\in\mathcal{K}\]

implies that $\bar{\mu}\in\mathcal{K}^\star$.

Example 1. (Bouligand cone of OP) Let $\mathcal{K} = {0}^m\times R^q_+$ and

\[G(x)=\begin{bmatrix}h(x) \\ g(x)\end{bmatrix}\]

the Bouligand cone $T_\mathcal{K}(G(\bar{x}))$ is

\[T_\mathcal{K}(G(\bar{x})) = \left\{\begin{pmatrix}d^1\\d^2\end{pmatrix}\in R^{m+q} : \exists t^k\downarrow 0 \text{ s.t. } D\left(\begin{pmatrix}h(\bar{x})+t^kd^1\\g(\bar{x})+t^kd^2\end{pmatrix},\begin{pmatrix}\{0\}^m\\R^q_+\end{pmatrix}\right) = o(t^k)\right\}\]

where $d^1\in R^m,d^2\in R^q$. Thus, for any $(d_1^T,d_2^T)^T\in T_\mathcal{K}(G(\bar{x}))$ there exists some $t^k\downarrow0$ such that

\[\frac{1}{t^k}D(h(\bar{x})+t^kd^1)\to0 \Leftrightarrow \|d^1\|\to 0 \Leftrightarrow d^1 = 0\]

and

\[\frac{1}{t^k}D(g(\bar{x})+t^kd^2,R^q_+) \to 0\]

Define $I(\bar{x})\equiv{i:g_i(\bar{x})=0}$ and we know that for all $i\in I(\bar{x})$ we have

\[\frac{1}{t^k}D(g_i(\bar{x})+t^kd_i^2,[0,\infty)) = \frac{1}{t^k}D(t^kd^2_i,[0,\infty)) = \begin{cases} 0&\text{ if }d^2_i\ge0\\ |d^2_i|&\text{ if }d^2_i<0 \end{cases}\to0 \Leftrightarrow d^2_i\ge0\]

and for all $i\not\in I(\bar{x})$ we have

\[\frac{1}{t^k}D(g_i(\bar{x})+t^kd^2_i,[0,\infty)) = \begin{cases} 0 &\text{ if }t^2_k<\min\{g_i(\bar{x}):i\not\in I(\bar{x})\}\\ O(d^2_i)&\text{ otherwise.} \end{cases} \to0\]

which holds for all $d^2_i$, $i\not\in I(\bar{x})$. Finally,

\[T_\mathcal{K}(G(\bar{x})) = \left\{\begin{pmatrix}u\\v\end{pmatrix}\in R^{m+q} : u=0,v_i\ge0\text{ if }i\in I(\bar{x}), u\in R^m,v\in R^q\right\}\]


7. Robinson’s CQ: Basic Concepts

Def. (Robinson’s CQ) Robinson’s CQ (RCQ) holds at $\bar{x}$ if

\[0\in\text{int}\{G(\bar{x})+ G'(\bar{x})\mathcal{X}-\mathcal{K}\} \tag{RCQ}\]

Lemma 13. Here $G’(x)$ is the linear operator from $\mathcal{X}$ to $\mathcal{Y}$ representing the Fréchet derivative of $G(x)$.

Lemma 14. Since $G’(x)$ is a linear operator, we have $G’(x)\mathcal{X}$ is a linear subspace of $\mathcal{Y}$.

Lemma 15. (Polar Cone Algebra) For any cones $A$ and $B$ we have

\[[A + B]^\circ = A^\circ \cap B^\circ\]

Proof.

(1) “$\subset$”. For any $d\in [A+B]^\circ$ we must have $d\in A^\circ$ since $0\in B$ and $d\in B^\circ$ since $0\in A$.

(2) “$\supset$”. For any $d\in A^\circ \cap B^\circ$ we must have $\langle d,a+b\rangle = \langle d,a\rangle + \langle d,b\rangle \le 0$ for all $a\in A$ and $b\in B$.

Proposition 1. (RCQ at feasible $\bar{x}$) Assume $G(\bar{x})\in\mathcal{K}$, then the following expressions (a)-(e) are equivalent, and are equivalent to RCQ,

\[G'(\bar{x})\mathcal{X} + R_\mathcal{K}(G(\bar{x})) = \mathcal{Y} \tag{a}\] \[G'(\bar{x})\mathcal{X}+T_\mathcal{K}(G(\bar{x})) = \mathcal{Y} \tag{b}\] \[[G'(\bar{x})\mathcal{X}]^\perp \cap \mathcal{N}_\mathcal{K}(G(\bar{x})) = \{0\} \tag{c}\] \[cl(G'(\bar{x})\mathcal{X}+T_\mathcal{K}(G(\bar{x}))) = \mathcal{Y} \tag{d}\] \[cl(G'(\bar{x})\mathcal{X}+R_\mathcal{K}(G(\bar{x}))) = \mathcal{Y} \tag{e}\]

Proof.

$(a)\Rightarrow(b)$. By Lemma 6, it’s obvious.

$(b)\Leftrightarrow(c)$. Compute the polar cone of (b) on both sides and we have

\[\{0\} = \mathcal{Y}^\circ = [G'(x)\mathcal{X}+T_\mathcal{K}(G(\bar{x}))]^\circ = [G'(x)\mathcal{X}]^\circ \cap [T_\mathcal{K}(G(\bar{x}))]^\circ = [G'(x)\mathcal{X}]^\perp \cap N_\mathcal{K}(G(\bar{x}))\]

The third equality is by the Lemma of Polar Cone Algebra. The last equality holds because $G’(x)\mathcal{X}$ is a linear subspace. is feasible.

$(c)\Rightarrow(d)$. Using (b) to derive (d) since (b) is equivalent to (c).

$(d)\Rightarrow(e)$. Given that $G(x)\in\mathcal{K}$ and $K$ is convex and closed, we calculate the polar cone of (d)​ on both sides and get

\[\begin{aligned} \{0\} &= [cl(G'(\bar{x})\mathcal{X}+T_\mathcal{K}(G(\bar{x})))]^\circ = [G'(\bar{x})\mathcal{X} + T_\mathcal{K}(G(\bar{x}))]^\circ\\ &= [G'(\bar{x})\mathcal{X}]^\circ \cap [T_\mathcal{K}(G(\bar{x}))]^\circ = [G'(\bar{x})\mathcal{X}]^\circ \cap [cl(R_\mathcal{K}(G(\bar{x})))]^\circ\\ &= [G'(\bar{x})\mathcal{X}]^\circ \cap [R_\mathcal{K}(G(\bar{x}))]^\circ = [G'(\bar{x})\mathcal{X} + R_\mathcal{K}(G(\bar{x}))]^\circ\\ &= [cl(G'(\bar{x})\mathcal{X} + R_\mathcal{K}(G(\bar{x})))]^\circ \end{aligned}\]

which implies (e). The forth equality is by Lemma 8.

$(e)\Rightarrow(a)$. If (a) fails to hold then $\text{ri}[G’(\bar{x})\mathcal{X} + R_\mathcal{K}(G(\bar{x}))] \not= \mathcal{Y}$. Then by (e) we have a contradiction,

\[\text{ri}(\mathcal{Y}) = \text{ri}[cl(G'(\bar{x})\mathcal{X} + R_\mathcal{K}(G(\bar{x})))] = \text{ri}[G'(\bar{x})\mathcal{X} + R_\mathcal{K}(G(\bar{x}))] \not= \mathcal{Y}\]

$\text{RCQ}\Rightarrow(a)$. Given that $G(x)\in\mathcal{K}$ and $\mathcal{K}$ is convex and closed, for any $y\in\mathcal{Y}$, by RCQ we know there exists some $t>0$ such that

\[-ty\in G(\bar{x})+G'(\bar{x})\mathcal{X}-\mathcal{K}\\ \Rightarrow\quad y\in G'(\bar{x})\mathcal{X}+\frac{\mathcal{K}-G(\bar{x})}{t} \subset G'(\bar{x})\mathcal{X}+\mathcal{R}_\mathcal{K}(G(\bar{x}))\]

which implies that $\mathcal{Y} \subset G’(\bar{x})\mathcal{X}+\mathcal{R}_\mathcal{K}(G(\bar{x}))$.

$(a)\Rightarrow\text{RCQ}$. Define a correspondence $\mathcal{M}:\mathcal{X}\times[0,\infty)\rightrightarrows\mathcal{Y}$ by

\[\mathcal{M}(x,t)= \begin{cases} -G'(\bar{x})x+t(\mathcal{K}-G(\bar{x})) & t\ge0\\ \emptyset & \text{otherwise} \end{cases}\]

Since $\mathcal{K}$ is convex and closed, we know that $\mathcal{M}$ is convex and closed. By (a) we have

\[\text{range}(\mathcal{M})=G'(\bar{x})\mathcal{X}+\mathcal{R}_\mathcal{K}(G(\bar{x}))=\mathcal{Y}\]

which implies

\[0 \in \text{int}(\mathcal{Y}) = \text{int}(\text{range}(\mathcal{M}))\]

By Proposition 2 of Perturbation of Correspondence, for any $(x,t)\in\mathcal{M}^{-1}(0)$ we have

\[0\in \text{int}(\mathcal{M}((x,t)+B_{\mathcal{X}\times[0,\infty)}))\]

Since $(0,0)\in\mathcal{M}^{-1}(0)$, the above condition is transformed into

\[\begin{aligned} 0 \in \text{int}(\mathcal{M}(B_{\mathcal{X}\times[0,\infty)})) &\subset \text{int}(\mathcal{M}(\mathcal{X}\times[0,1]))\\ &= \text{int}(\cup_{t\in[0,1]}\ t(\mathcal{K}-G(\bar{x}))-G'(x)\mathcal{X})\\ &= \text{int}(\mathcal{K}-G(\bar{x})-G'(x)\mathcal{X}) \end{aligned}\]

The last equality is because $\mathcal{K}-G(\bar{x})$ is convex and closed and contains the origin, thus $t(\mathcal{K}-G(\bar{x}))$ is the shrink of $\mathcal{K}-G(\bar{x})$ when $t\in[0,1]$, and thus contained by $\mathcal{K}-G(\bar{x})$.

Proposition 2. (RCQ at feasible $\bar{x}$ for solid $\mathcal{K}$) Assume $G(\bar{x})\in\mathcal{K}$ and $\mathcal{K}$ has a nonempty interior, RCQ is equivalent to $G(\bar{x})+G’(\bar{x})d\in \text{int}(\mathcal{K})$ for some $d\in\mathcal{X}$.

Proof.

$\Leftarrow$. If such a $d\in\mathcal{X}$ exists then there exists $\epsilon > 0$ such that

\[\begin{aligned} &B(G(\bar{x})+G'(\bar{x})d,\epsilon)\subset \mathcal{K}\\ \Rightarrow\quad B(&0,\epsilon) \subset G(\bar{x})+G'(\bar{x})d -\mathcal{K} \end{aligned}\]

Thus the RCQ holds.

$\Rightarrow$. Suppose that for all $d\in\mathcal{X}$ we have $G(\bar{x})+G’(\bar{x})d\not\in \text{int}(\mathcal{K})$. Then we know the sets $G(\bar{x})+G’(\bar{x})\mathcal{X}$ and $\text{int}(\mathcal{K})$ are isolated. Then by separation theorem there exists $\lambda\in\mathcal{Y}^*$ such that

\[\langle\lambda,G(\bar{x})-G'(\bar{x})d\rangle \ge \langle\lambda,k\rangle\]

for all $d\in\mathcal{X}$ and $k\in\mathcal{K}$. Thus, for any $y\in\mathcal{Y}$ such that $\langle\lambda,y\rangle<0$ we know $ty\not\in G(\bar{x})-G’(\bar{x})\mathcal{X}-\mathcal{K}$ for all $t>0$. Thus $B(0,\epsilon)\not\subset G(\bar{x})-G’(\bar{x})\mathcal{X}-\mathcal{K}$ for all $\epsilon>0$.


8. Robinson’s CQ and Other CQs

Proposition 3. (RCQ for OP) Suppose $G(x)=(h(x),g(x))$, $\mathcal{K}={0}^m\times\mathcal{C}$ and $G(\bar{x})\in\mathcal{K}$. COP becomes

\[\begin{aligned} \min\quad &f(x)\\ s.t.\quad &h(x)=0\\ &g(x)\in \mathcal{C} \end{aligned} \tag{OP}\]

where $f,h,g$ are Fréchet differentiable. If $\mathcal{C}$ is closed, convex and has a nonempty interior, then RCQ at the feasible point $\bar{x}$ is equivalent to

(A) $h’(\bar{x})$ is onto;

(B) $\exists d\in\mathcal{X}$ such that $h’(\bar{x})d=0$ and $g(\bar{x})+g’(\bar{x})d\in \text{int}(\mathcal{C})$.

Proof.

”$\Rightarrow$”. Since RCQ holds, we know there exists $\epsilon_1>0$ and $\epsilon_2>0$ such that for all $y \in B_\mathcal{Y}(\epsilon_1)$ and $z \in B_\mathcal{Y}(\epsilon_2)$, there exists $d\in\mathcal{X}$ with

(A) $y=h(\bar{x})+h’(\bar{x})d-{0}^m = h’(\bar{x})d$

(B) $z\in g(\bar{x})+g’(\bar{x})d-\mathcal{C}$

Condition (A) implies that $h’(\bar{x})$ is onto. Letting $y=0$ we have $d\in \text{null}(h’(\bar{x}))$, implying that

\[0\in \text{int}(g(\bar{x})+g'(\bar{x})(\text{null}(h'(\bar{x}))-\mathcal{C})\]

Thus, there exists $d\in\mathcal{X}$ such that $0=h’(\bar{x})d$ and $g(\bar{x})+g’(\bar{x})d\in \text{int}(\mathcal{C})$.

”$\Leftarrow$”. Since $\text{int}(\mathcal{C})\not=\emptyset$, the conditions above is equivalent to

(A) $h’(x)$ is onto,

(B) $0\in \text{int}(g(\bar{x})+g’(\bar{x})(\text{null}(h’(\bar{x}))-\mathcal{C})$.

By Lemma 3, condition (A) is equivalent to that there exists a constant $M>0$ such that for all $y\in\mathcal{Y}$ there exists a $d_1\in\mathcal{X}$ with $y = h’(\bar{x})d_1$ and $|d_1| \le M|y|$.

Condition (B) is equivalent to $B(0,\epsilon)\subset g(\bar{x})+g’(\bar{x})(\text{null}(h’(\bar{x}))-\mathcal{C}$ for some $\epsilon>0$.

RCQ holds if there exists $\epsilon_1>0$ and $\epsilon_2>0$ such that for all $y \in B_\mathcal{Y}(\epsilon_1)$ and $z \in B_\mathcal{Y}(\epsilon_2)$, there exists $d\in\mathcal{X}$ with

\[\begin{aligned} &y=h(\bar{x})+h'(\bar{x})d-\{0\}^m = h'(\bar{x})d \\ &z\in g(\bar{x})+g'(\bar{x})d-\mathcal{C} \end{aligned}\]

Let $d_2 = d-d_1$, the above conditions are transformed into finding such a $d_2$ that

\[\begin{aligned} 0 &= h'(\bar{x})d_2\\ z -g'(\bar{x})d_1 &\in g(\bar{x}) + g'(\bar{x})d_2 - \mathcal{C} \end{aligned}\]

By letting $\epsilon_1$ and $\epsilon_2$ sufficiently small we have $z-g’(\bar{x})d_1\in B(0,\epsilon)$. Thus, there exists $d_2$ satisfying the above two conditions.

Proposition 4. (RCQ for NLP: MFCQ) For the NLP below,

\[\begin{aligned} \min\quad &f(x)\\ s.t.\quad &h(x)=0\\ &g(x)\ge0 \end{aligned} \tag{NLP}\]

where $f,g,h$ are all smooth. RCQ is equivalent to MFCQ:

(A) $(\nabla h_i(\bar{x}))^m_{i=1}$ are linearly independent,

(B) $\exists \bar{d}\in\mathcal{X}$ such that $\langle \nabla h_i(\bar{x}),\bar{d}\rangle=0$, $i=1,…,m$ and $\langle\nabla g_j(\bar{x}),\bar{d}\rangle>0$, $j\in \mathcal{I}(\bar{x})$.

where $\mathcal{I}(\bar{x})$ is the index set of $g_i(.)$ that is active at $\bar{x}$.

Proof.

Let

\[G(x) = \begin{bmatrix}h(x) \\ g(x)\end{bmatrix}\in R^{m+q}\]

By Proposition 1(b) we have RCQ is equivalent to

\[G'(\bar{x})\mathcal{X}+T_\mathcal{K}(G(\bar{x})) = R^{m+q}\]

where $T_\mathcal{K}(G(\bar{x}))$ is given in Example 1. Hence, RCQ is transformed into

\[\begin{bmatrix} h'(\bar{x})\\ g'(\bar{x}) \end{bmatrix}\mathcal{X} + \left\{\begin{pmatrix}0\\v\end{pmatrix}\in R^{m+q} : v_i\ge0\text{ if }i\in I(\bar{x}), v\in R^q\right\} = R^{m+q}\]

where

\[h'(\bar{x}) = \begin{bmatrix} \nabla h_1(\bar{x})^T\\ \nabla h_2(\bar{x})^T\\ ...\\\ \nabla h_m(\bar{x})^T \end{bmatrix} \quad \text{and} \quad g'(\bar{x}) = \begin{bmatrix} \nabla g_1(\bar{x})^T\\ \nabla g_2(\bar{x})^T\\ ...\\\ \nabla g_q(\bar{x})^T \end{bmatrix}\]

“RCQ\(\Rightarrow\)MFCQ”. By RCQ, $h’(\bar{x})\mathcal{X}=R^m$ implies that $h’(\bar{x})$ is linearly independent. For the point $(0, …, 0, -1, …, -1)\in R^{m+q}$ where the first $m$ scalers are zero and remaining are -1, by RCQ there must be a $d\in\mathcal{X}$ such that $h’(\bar{x})d=0$ and $v_j\ge0$, $\nabla g_j(\bar{x})d+v_j=-1$ where $j\in I(\bar{x})$. Thus, $\nabla g_j(\bar{x})d<0$ for $j\in I(\bar{x})$. Then $\bar{d}=-d$ is what we need.

“MFCQ\(\Rightarrow\)RCQ”. Condition (A) of MFCQ implies that $h’(\bar{x})\mathcal{X}=R^m$. Suppose there exists a $\bar{d}\in\mathcal{X}$ such that condition (B) of MFCQ holds, then for any $\begin{bmatrix}y_1 \ y_2\end{bmatrix}\in R^{m+q}$ where $y_1\in R^m$ and $y_2\in R^q$, we have

(A) there exists a $d\in\mathcal{X}$ such that $h’(\bar{x})d = y_1$.

(B) there exists some $k<0$ and $v\in{v\in R^q:v_j\ge0\text{ if }j\in I(\bar{x})}$ such that $y_2 - g’(\bar{x})d = \langle g’(\bar{x}),k\bar{d}\rangle+ v$.

Thus,

\[\begin{bmatrix}y_1\\y_2\end{bmatrix} = \begin{bmatrix}h'(x)\\g'(x)\end{bmatrix}(d+k\bar{d}) + \begin{bmatrix}0\\v\end{bmatrix}\]

which implies that RCQ holds.


9. Approximation of $\Psi_G$ at $(\bar{x}, 0)$

Suppose that $\bar{x}$ is a feasible point where the Robinson’s CQ holds. Denote by $\Psi_H(x) \equiv H(x) - \mathcal{K}$ and we have

\[\begin{aligned} \text{RCQ}:\quad 0 &\in \text{int}(G(\bar{x})+G'(\bar{x})\mathcal{X}-\mathcal{K})\\ \Leftrightarrow\quad 0 &\in \text{int}(\text{range}(H)-\mathcal{K})\\ \Leftrightarrow\quad 0 &\in \text{int}(\text{range}(H-\mathcal{K}))\\ \Leftrightarrow\quad 0 &\in \text{int}(\text{range}(\Psi_H)) \end{aligned}\]

Note that $\mathcal{K}$ is closed and convex and so is $\Psi_H$. By the Proposition 3 of Perturbation of Correspondence we have $\Psi_H$ is open at $(\bar{x},0)$ at the rate of $\gamma$ for some $\gamma>0$. Then by Proposition 4 of Perturbation of Correspondence we know $\Psi_H$ is metric regular at $(\bar{x},0)$ at the rate $1/\gamma$.

Define $H(x) \equiv G(\bar{x})+G’(\bar{x})(x-\bar{x})$ the first order Taylor approximation of $G$ around $\bar{x}$. Since $G$ is (Fréchet) continuously differentiable, we know

\[A(x) \equiv |G(x)-H(x)| = o(\|x-\bar{x}\|)\]

is continuous. It is straightforward that there $A$ is locally Lipschitz, and there exists a neighborhood of $\bar{x}$ with the Lipschitz modulus $\kappa < \gamma$.

Finally, by the Proposition 5 of Perturbation of Correspondence we know $\Psi_G(x) \equiv G(x) - \mathcal{K}$ is metric regular at the point $(\bar{x},0)$ at the rate $1/(\gamma-\kappa)$.

Proposition 5. For COP, if Robinson’s CQ holds at $\bar{x}$ then there exists a neighborhood $\mathcal{N}$ of $\bar{x}$ such that for all $x\in\mathcal{N}$ we have

\[D_\mathcal{X}(x,\mathcal{F}) = O(D_\mathcal{Y}(G(x),\mathcal{K}))\]

Proof.

Since Robinson’s CQ holds at $\bar{x}$ we know $\Psi_G$ is metric regular at $(\bar{x},0)$. Then there exists a neighborhood $\mathcal{N}$ of $(\bar{x},0)$ and a $c>0$ such that

\[D_\mathcal{X}(x,\Psi_G^{-1}(y)) \le c \cdot D_\mathcal{Y}(y, \Psi_G(x))\]

for all $(x, y)\in \mathcal{N}$. Note that

\[x \in (G-\mathcal{K})^{-1}(y) \ \Leftrightarrow\ y \in G(x)-\mathcal{K} \ \Leftrightarrow\ y+\mathcal{K} \in G(x) \ \Leftrightarrow\ x \in G^{-1}(y+\mathcal{K})\]

We have

\[D_\mathcal{X}(x, G^{-1}(y+\mathcal{K})) \le c \cdot D_\mathcal{Y}(y,G(x)-\mathcal{K})\]

By letting $y = 0$ we have the result.


10. Linearization of COP

Def. (Linearization of COP) The linearization of COP at $\bar{x}\in\mathcal{X}$ is given by

\[\min_{d\in\mathcal{X}}\ f'(\bar{x})d\quad\text{s.t.}\ \ d\in T_\mathcal{F}(\bar{x}) \tag{LCOP}\]

Proposition 6. (Optimal Solution of LCOP) Let $\bar{x}$ be a locally optimal solution of COP. Then $d=0$ is an optimal solution of LCOP.

Proof.

For any $d\in T_\mathcal{F}(\bar{x})$. By the definition of Bouligand tangent cone, there exists a sequence $t^k\downarrow 0$ and $(x^k)\subset\mathcal{F}$ such that $x^k = \bar{x}+t^kd+o(t^k)$. Since $\bar{x}$ is locally optimal solution of COP we have

\[0 \le \lim_{k\to\infty}\frac{f(x^k)-f(\bar{x})}{t^k} = f'(\bar{x})d\]

Also, $0\in T_\mathcal{F}(\bar{x})$, meaning that $d=0$ is feasible.

Proposition 7. (RCQ for LCOP) Suppose $\bar{x}$ is a locally optimal solution of COP and Robinson’s CQ holds at $\bar{x}$. Then

\[T_\mathcal{F}(\bar{x}) = S(\bar{x}) \equiv \{d\in\mathcal{X} : G'(\bar{x})d\in T_\mathcal{K}(G(\bar{x}))\}\]

Proof.

“$T_\mathcal{F}(\bar{x})\subset S(\bar{x})$”. For any $d\in T_\mathcal{F}(\bar{x})$, by the definition of Bouligand cone we know there exists $t^k\downarrow 0$ and $(x^k)\subset\mathcal{F}$ with $x^k = \bar{x}+t^kd + o(t^k)$. Note that

\[x^k\in\mathcal{F} \ \Leftrightarrow\ G(x^k)\in\mathcal{K} \ \Leftrightarrow\ G(\bar{x})+G'(\bar{x})(t^kd)+o(t^k) \in\mathcal{K}\]

Thus for all $t^k$ we have

\[G'(\bar{x})d \in \frac{\mathcal{K}-G(\bar{x})}{t^k}+o(1) \ \Rightarrow\ G'(\bar{x})d\in \limsup_{k\to\infty}\frac{\mathcal{K}-G(\bar{x})}{t^k} = T_\mathcal{K}(G(\bar{x}))\]

”$\text{RCQ}:S(\bar{x})\subset T_\mathcal{F}(\bar{x})$”. For any $d\in S(\bar{x})$ we have $G’(\bar{x})d\in T_\mathcal{K}(G(\bar{x}))$. Then by definition of Bouligand cone there exists $t^k\downarrow0$ such that

\[D_\mathcal{Y}(G(\bar{x})+G'(\bar{x})t^kd, \mathcal{K}) = o(t^k)\]

Since RCQ holds at $\bar{x}$, there exists a $N>0$ such that for all $k>N$ we have

\[D_\mathcal{X}(\bar{x}+t^kd,\mathcal{F}) = O(D_\mathcal{Y}(G(\bar{x}+t^kd),\mathcal{K})) = o(t^k) \ \Rightarrow\ d\in T_\mathcal{F}(\bar{x})\]

Corollary 1. (Optimal Solution of LOP) Suppose $\bar{x}$ is a locally optimal solution of COP and Robinson’s CQ holds at $\bar{x}$. Then $d=0$ is an optimal solution of the LOP (the linearization of OP) defined below

\[\begin{aligned} &\min_{d\in\mathcal{X}}\ f'(\bar{x})d\quad\quad \text{s.t.}\ G'(\bar{x})d\in T_\mathcal{K}(G(\bar{x})) \end{aligned} \tag{LOP}\]


11. Proper Lagrange Multiplier Set

Remember the Lagrange multiplier set at $\bar{x}$ is

\[\mathcal{M}(\bar{x}) = \{\mu\in\mathcal{Y}^*:\nabla f(\bar{x})=\nabla G(\bar{x})\mu, -\mu \in \mathcal{N}_\mathcal{K}(G(\bar{x}))\}\]

Here, we say a set is proper if it is nonempty, convex, bounded and closed.

Note. ($\mathcal{M}(\bar{x})$ is closed) For any sequence $(\mu^k)\subset\mathcal{M}(\bar{x})$ with $\mu^k\to\mu$, we must have $-\mu\in\mathcal{N}_\mathcal{K}(G(\bar{x}))$ by the closeness of normal cone. Also since ${\mu:\nabla f(\bar{x})-\nabla G(\bar{x})\mu=0}$ is an affine subspace which is closed, we know that $\mu\in\mathcal{M}(\bar{x})$. END.

Note. ($\mathcal{M}(\bar{x})$ is convex) For any $\mu_1$ and $\mu_2$ in $\mathcal{M}(\bar{x})$ we know for all $t\in[0,1]$ we have

\[-[t\mu_1 + (1-t)\mu_2]\in\mathcal{N}_\mathcal{K}(G(\bar{x}))\]

and

\[[t+(1-t)]\nabla f(\bar{x})-\nabla G(\bar{x})[t\mu_1+(1-t)\mu_2] = 0\]

Thus, $\mathcal{M}(\bar{x})$ is convex. END.

Note. ($\mathcal{M}(\bar{x})$ is bounded $\Rightarrow$ RCQ holds at $\bar{x}$) If the RCQ does NOT hold at $\bar{x}$, then by Proposition 1 (c) we have

\[\{0\}\subset[G'(\bar{x})\mathcal{X}]^\perp \cap \mathcal{N}_\mathcal{K}(G(\bar{x})) \not= \{0\}\]

Thus there exists a $\mu_0\not=0$ such that $-\mu_0\in\mathcal{N}_\mathcal{K}(G(\bar{x}))$ and $-\mu_0\in[G’(\bar{x})\mathcal{X}]^\perp$ $\Leftrightarrow$ $\langle\mu_0,G’(\bar{x})x\rangle=0$ for all $x\in\mathcal{X}$ $\Leftrightarrow$ $\nabla G(\bar{x})\mu_0=0$. For any $\mu\in\mathcal{M}(\bar{x})$ we can verify that $\mu+t\mu_0\in\mathcal{M}(\bar{x})$ for all $t>0$: First,

\[\nabla f(\bar{x})-\nabla G(\bar{x})(\mu+t\mu_0)=0\]

Then, by the closeness and convexity of normal cone $\mathcal{N}_\mathcal{K}(G(\bar{x}))$ we have

\[-(\mu+t\mu_0) = -(1+t)\left(\frac{1}{1+t}\mu+\frac{1}{1+t}\mu_0\right) \in \mathcal{N}_\mathcal{K}(G(\bar{x}))\]

Thus, $\mathcal{M}(\bar{x})$ is unbounded. END.

Note. (RCQ holds at $\bar{x}$ $\Rightarrow$ $\mathcal{M}(\bar{x})$ is bounded) If $\mathcal{M}(\bar{x})$ is unbounded then there exists a sequence $(\mu^k)\subset\mathcal{M}(\bar{x})$ with

\[1/\|\mu^k\|\to0 \quad\text{ and }\quad \mu^k/\|\mu^k\|\to\mu\not=0\]

Then we have

(1) $\mu^k\in-\mathcal{N}_{\mathcal{K}}(G(\bar{x}))$. It implies that

\[\mu^k/\|\mu^k\|\in-\mathcal{N}_{\mathcal{K}}(G(\bar{x}))\]

and then $\mu\in-\mathcal{N}_{\mathcal{K}}(G(\bar{x}))$ by the closeness of normal cone.

(2) $\nabla f(\bar{x}) - \nabla G(\bar{x})\mu^k = 0$. Divided by $|\mu^k|$ on both sides and we have

\[\frac{\nabla f(\bar{x}) - \nabla G(\bar{x})\mu^k}{\|\mu^k\|} = 0\]

Letting $k\to\infty$ we have $\nabla G(\bar{x})\mu=0$ $\Leftrightarrow$ $\mu\in[G’(\bar{x})\mathcal{X}]^\perp$.

Then (1) and (2) contradict to Proposition 1 (c). END.

Note. (RCQ holds at $\bar{x}\Rightarrow \mathcal{M}(\bar{x})$ is nonempty) Denote by

\[\bar{N} = \{s\in\mathcal{X}:s=\nabla G(\bar{x})\mu,-\mu\in\mathcal{N}_\mathcal{K}(G(\bar{x}))\}\]

It’s obvious that $\bar{N}$ is a convex cone. Now we want to show its closeness. For any $(z^k)\subset\bar{N}$ that converges to $z$, there exists a sequence $(\mu^k)$ with $\mu^k\in-\mathcal{N}_{\mathcal{K}}(G(\bar{x}))$ and $z^k=\nabla G(\bar{x})\mu^k$ for all $k$.

Case A, if $(\mu^k)$ is bounded then there must be a sub-sequence $(\mu^{i_k})$ that converges to $\mu\in-\mathcal{N}_\mathcal{K}(\bar{x})$ by the closeness of normal cone. Then we must have $z^{i_k}\to z’$ with $z’=\nabla G(\bar{x})\mu\in\bar{N}$. Since $z^k\to z$ we know $z’=z$. Thus $z\in\bar{N}$.

Case B, if $(\mu^k)$ is unbounded, then by the closeness of normal cone there must be a subsequence

\[1/\left\|\mu^{i_k}\right\|\to0 \quad\text{ with }\quad \mu^{i_k}/\left\|\mu^{i_k}\right\|\to\mu\in-\mathcal{N}_\mathcal{K}(\bar{x})\]

where $\mu\not=0$. Note that $(z^k)$ is bounded since it converges, we know

\[z^{i_k}/\|\mu^{i_k}\| = \nabla G(\bar{x})\mu^{i_k}/\|\mu^{i_k}\|\to0\]

which implies that $\nabla G(\bar{x})\mu=0$, and furthermore $\mu\in[G’(\bar{x})\mathcal{X}]^\perp$. That contradicts to Robinson’s CQ by Proposition 1 (c). Thus, $\bar{N}$ is closed. Finally, $\bar{N}$ is a convex closed cone.

Next, we shall show that $\nabla f(\bar{x})\in\bar{N}$. If NOT, let

\[0\not=\bar{d} \equiv \Pi_\bar{N}(\nabla f(\bar{x}))-\nabla f(\bar{x}) \in\mathcal{X}\]

Since $\bar{N}$ is a closed convex cone, by the properties of metric projection we have

\[\begin{aligned} \nabla f(\bar{x})\bar{d} &= \langle \nabla f(\bar{x})-\Pi_\bar{N}(\nabla f(\bar{x})), -\nabla f(\bar{x}) \rangle\\ &= \langle \nabla f(\bar{x})-\Pi_\bar{N}(\nabla f(\bar{x})), \Pi_\bar{N}(\nabla f(\bar{x}))-\nabla f(\bar{x}) \rangle = -\|\bar{d}\|^2 < 0 \end{aligned}\]

and for all $s\in\bar{N}$ we have

\[\langle \nabla f(\bar{x})-\Pi_\bar{N}(\nabla f(\bar{x})), s-\Pi_\bar{N}(\nabla f(\bar{x}))\rangle \le 0 \Leftrightarrow \langle \bar{d},s \rangle \ge 0\]

It follows that for all $\mu\in-\mathcal{N}_\mathcal{K}(G(\bar{x}))$ we have

\[0 \le \langle \bar{d},\nabla G(\bar{x})\mu \rangle = \langle G'(\bar{x})\bar{d},\mu\rangle \Rightarrow \langle G'(\bar{x})\bar{d},-\mu\rangle \le 0\]

meaning that $G’(\bar{x})\bar{d}\in\mathcal{T}_\mathcal{K}(G(\bar{x}))$. Then $\nabla f(\bar{x})\bar{d}<0$ contradicts to Corollary 1 (Since $d=0$ optimizes the LOP, we should have $\nabla f(\bar{x})\bar{d}\ge0$ if $d\not=0$). END.


Finally, we conclude the main result of this note in the following Proposition:

Proposition 8. (RCQ $\Leftrightarrow$ Proper Lagrange Multiplier) Suppose $\bar{x}$ is a locally optimal solution of COP. Then the Lagrange multiplier set $\mathcal{M}(\bar{x})\subset\mathcal{Y}$ is a nonempty, convex, bounded and compact set if and only if Robinson’s CQ holds at $\bar{x}$.