Aug 14, 2021, 12:17:55 PM

to Manopt

I have a quartic objective function, listed as follows. Could you please help me compute the Euclidean gradient and Hessian of the objective function, and transform them into their Riemannian versions?

Aug 16, 2021, 8:59:47 AM

to Manopt

Hello Robert,

Could you tell us more about what you have tried so far? This will make it easier to identify the concept that needs clarification.

Best,

Nicolas

Aug 19, 2021, 8:10:06 AM

to Manopt

Hello Nicolas,

Thanks very much for your kind help.

I don’t know how to use the file-upload feature, so I uploaded a few pictures instead.

The work I have tried so far is attached below.

Best,

Robert

Aug 20, 2021, 3:04:12 AM

to Manopt

Hello,

There is no need to differentiate "with respect to $s$ or $s^*$". Formally, here's how to proceed:

- We think of $C^n$ as a *real* vector space of dimension $2n$. Indeed, if $e_1, \ldots, e_n$ are the columns of the identity matrix, then $e_1, i e_1, e_2, i e_2, \ldots, e_n, i e_n$ are $2n$ vectors which form a basis for $C^n$: every vector in $C^n$ can be expressed (uniquely) as a linear combination of those $2n$ vectors with *real* coefficients.
- On $C^n$, we define a *real* inner product: see the Euclidean reminders early in Chapter 3. Explicitly, for $u, v \in C^n$, we define $<u, v> = Re(u^* v)$, where $Re$ extracts the real part.
- If $f : C^n \to R$ is your cost function, then its gradient at $x$ is defined as the vector $grad f(x) \in C^n$ such that $D f(x)[v] = <grad f(x), v>$ for all $v \in C^n$. The left-hand side is the directional derivative, $D f(x)[v] = lim_{t \to 0} (f(x+tv) - f(x))/t$: just work out that limit the normal way. For example, complex conjugation is a linear map (since we allow only real coefficients in linear combinations), and it is easy to differentiate through linear maps. Then manipulate the expression you find for $D f(x)[v]$ until it looks like $<w, v>$ for some vector $w$, where $<., .>$ is the inner product defined above. That vector $w$ is the gradient of $f$ on $C^n$.
- You can give this gradient as problem.egrad in Manopt: the toolbox will take care of transforming it into a Riemannian gradient if your manifold is a Riemannian submanifold of $C^n$, such as the phase manifolds for example.
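As an illustration of this recipe (with a made-up quartic cost, not necessarily your actual objective), here is a small numerical check: take $f(x) = (x^* A x)^2$ with $A$ Hermitian, so that $x^* A x$ is real, derive the gradient with respect to the real inner product $<u, v> = Re(u^* v)$, and compare it against a finite-difference estimate of the directional derivative.

```python
import numpy as np

# Toy quartic cost on C^n (an assumption for illustration): f(x) = (x^* A x)^2
# with A Hermitian, so that x^* A x is real.
rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                       # Hermitian matrix

def f(x):
    return (x.conj() @ A @ x).real ** 2

def egrad(x):
    # Following the recipe: with g(x) = x^* A x (real), one finds
    #   D f(x)[v] = 2 g(x) * 2 Re(x^* A v) = Re((4 g(x) A x)^* v) = <egrad, v>.
    return 4 * (x.conj() @ A @ x).real * (A @ x)

# Check against the directional-derivative limit with a random direction v,
# using the real inner product <u, v> = Re(u^* v).
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
t = 1e-6
fd = (f(x + t * v) - f(x - t * v)) / (2 * t)   # finite-difference estimate
ip = (egrad(x).conj() @ v).real                # <egrad f(x), v>
print(abs(fd - ip) / (1 + abs(ip)))            # small relative error
```

If the two numbers agree for random directions, the candidate gradient is very likely correct; this is essentially the check that Manopt's checkgradient tool automates for you.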

I hope this clarifies things.

Best wishes,

Nicolas

Aug 27, 2021, 10:44:58 PM

to Manopt

Hello Nicolas,

Thanks very much for your time and kind help.

I worked out the gradient and Hessian of the objective function.

Now, I want to solve this minimization problem using the ARC algorithm in the Manopt toolbox.

On page 26 of your paper *Adaptive regularization with cubics on manifolds*, for ARC, equation (3) is not checked when solving the subproblem. As far as I know, to avoid the saddle points of an optimization problem, the gradient of the cost function should be 0 and the Hessian matrix should be positive definite. If only the first-order condition is considered, can it be guaranteed that saddle points are avoided?

Moreover, in equations (2) and (3) of the paper, the gradient is not required to be exactly 0, only smaller than some value; and the Hessian is not required to be strictly positive definite, only to have smallest eigenvalue greater than some value. What is the reason for doing this?

Thanks for your time again.

Best wishes,

Robert

Aug 29, 2021, 12:30:26 PM

to Manopt

Hello Roberts,

> If only the first-order condition is considered, can it be guaranteed that saddle points are avoided?

Guaranteed, no. However, in practice, it is unlikely that you would converge to a saddle point, because saddle points are unstable. They can slow down algorithms significantly though. Sometimes people add small random perturbations in certain places in their algorithms to escape saddle points. For example, with the trustregions algorithm in Manopt you can set "options.useRand = true;". This is a heuristic.
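To see why saddle points are unstable, here is a toy illustration (plain gradient descent in numpy, not Manopt code) on $f(x, y) = x^2 - y^2$, whose only critical point, the origin, is a saddle. Started exactly on the line $y = 0$, the iterates converge to the saddle; an arbitrarily small random perturbation sends them away from it.

```python
import numpy as np

def grad(p):
    # Gradient of f(x, y) = x^2 - y^2, which has a saddle point at the origin.
    return np.array([2 * p[0], -2 * p[1]])

def descend(p, steps=200, lr=0.1):
    # Plain fixed-step gradient descent.
    for _ in range(steps):
        p = p - lr * grad(p)
    return p

start = np.array([1.0, 0.0])       # exactly on the stable manifold y = 0
at_saddle = descend(start)         # converges to the saddle (0, 0)

rng = np.random.default_rng(0)
perturbed = start + 1e-6 * rng.standard_normal(2)
escaped = descend(perturbed)       # the tiny y-component grows by a factor
                                   # (1 + 2*lr) per step and escapes the saddle
```

The perturbed run escapes because the negative-curvature direction amplifies any nonzero component along it, which is exactly why exact convergence to a saddle is so unlikely in practice.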

> Moreover, in equations (2) and (3) of the paper, the gradient is not required to be exactly 0, only smaller than some value; and the Hessian is not required to be strictly positive definite, only to have smallest eigenvalue greater than some value. What is the reason for doing this?

The reason is that it is not possible to guarantee the computation of an exact critical point (be it first- or second-order) in finite time. This is for the same reason that it is not possible to compute the exact roots of a general polynomial of degree >= 5 in finite time: nonlinear problems are just difficult in this way. The best we can do is find points which approximately satisfy the conditions we aim for.
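In code, those relaxed conditions amount to simple tolerance checks, along the lines of the following sketch (the function name and default tolerances are mine, chosen for illustration; they play the role of the epsilons in equations (2) and (3)):

```python
import numpy as np

def is_approx_second_order_point(grad, hess, eps_g=1e-6, eps_H=1e-4):
    """Approximate second-order conditions: ||grad|| <= eps_g and
    lambda_min(hess) >= -eps_H, instead of the unattainable exact
    conditions grad = 0 and hess positive semidefinite."""
    small_gradient = np.linalg.norm(grad) <= eps_g
    lam_min = np.linalg.eigvalsh(hess).min()   # hess assumed symmetric/Hermitian
    return small_gradient and lam_min >= -eps_H

# A minimizer-like point passes; a saddle-like point fails.
ok = is_approx_second_order_point(np.zeros(2), np.eye(2))
bad = is_approx_second_order_point(np.zeros(2), np.diag([1.0, -1.0]))
```

Algorithms like ARC then come with guarantees of the form: a point satisfying such tolerance-based conditions is found after a number of iterations that depends on eps_g and eps_H.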

Best,

Nicolas

Aug 29, 2021, 12:30:56 PM

to Manopt

(sorry for the extraneous 's' in your name!)
