Goal: I need to create two spheres, one of which can be rolled over the surface of the other with the mouse, and implement a camera that can be moved around the spheres with the keyboard.

Implementation: I keep a matrix that stores the current rotation state of the rolling sphere. When the user drags, I get a series of mouse move events, and on each event I calculate by how many degrees the rotation has changed around the current X and Y axes, as the user sees them. Then I build a matrix that represents these two rotations and multiply the sphere's rotation matrix by it in reverse order. The reverse order is necessary because the rotation happens from the point of view of the camera, not from the point of view of model space.
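A minimal sketch of that accumulation step, not the code from the repository. It assumes column vectors (`v' = M * v`), camera axes that coincide with the world axes, and a hypothetical sensitivity constant `RADIANS_PER_PIXEL`; the helper names are illustrative only.

```javascript
// 3x3 rotation matrices stored row-major; vectors are columns (v' = M * v).
function multiply(a, b) {
  const out = new Array(9).fill(0);
  for (let r = 0; r < 3; r++)
    for (let c = 0; c < 3; c++)
      for (let k = 0; k < 3; k++)
        out[r * 3 + c] += a[r * 3 + k] * b[k * 3 + c];
  return out;
}

function rotationX(angle) {
  const c = Math.cos(angle), s = Math.sin(angle);
  return [1, 0, 0,
          0, c, -s,
          0, s, c];
}

function rotationY(angle) {
  const c = Math.cos(angle), s = Math.sin(angle);
  return [c, 0, s,
          0, 1, 0,
          -s, 0, c];
}

const RADIANS_PER_PIXEL = 0.01; // assumed mouse sensitivity

// Fold one mouse-move delta (in pixels) into the accumulated rotation.
function applyDrag(rotation, dxPixels, dyPixels) {
  const delta = multiply(rotationX(dyPixels * RADIANS_PER_PIXEL),
                         rotationY(dxPixels * RADIANS_PER_PIXEL));
  // Pre-multiplying applies the delta about the fixed (viewer) axes;
  // post-multiplying would apply it about the sphere's own axes instead.
  return multiply(delta, rotation);
}
```

In a `mousemove` handler this would be called as `rotation = applyDrag(rotation, event.movementX, event.movementY)`.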

Problem: With this implementation, the second sphere never changes its point of contact with the first sphere (in effect, it slides along it). How can the movement of the contact point between the spheres be implemented analytically in terms of matrices?

Here is the code if anyone is interested: https://github.com/AndrewStrizh/spheres-with-webGL

## Answer

What you need is to be able to control the rotation of your sphere around two (or more) different pivots.

A proper way to deal with complex transformations is to implement hierarchical transformations:

http://web.cse.ohio-state.edu/~wang.3602/courses/cse3541-2019-fall/05-Hierarchical.pdf

In this case, you can control the rotation of `sphereB` around `sphereA` by making `sphereB` a child of a third, invisible object (call it `Locator`) located at the center of `sphereA`. With a proper implementation of hierarchical transformations, rotating the `Locator` also rotates `sphereB` around this `Locator` (and therefore around `sphereA`). At the same time, you can also rotate `sphereB` around its own center, making it spin.

In practice, implementing true hierarchical transformations requires a scene graph with proper node traversal, and so on. But the main idea is that every object has what is called a local transform matrix and a world transform matrix. The local transform matrix holds only that object's own transformation (relative to its own origin), while the world transform matrix is the final matrix, the combined result of all the hierarchical transformations (from its parents) applied to this object. The world transform matrix is the one used as the “model” matrix, to be multiplied with the view and projection matrices. The world matrix of a node is computed from its local matrix and its parent's world matrix like this (pseudocode):

`node.worldMatrix = node.localMatrix * node.parent.worldMatrix;`
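As a rough illustration, the pseudocode above could look like this in JavaScript. This is only a sketch: the `Node` class and the matrix helpers are hypothetical names, and matrices are assumed to be flat column-major arrays used with column vectors, as is common in WebGL code. Under that convention the product is written in the opposite order (`parent.worldMatrix * localMatrix`); the order in the pseudocode matches a row-vector convention.

```javascript
// 4x4 matrices as flat column-major arrays (the layout WebGL uniforms expect).
function identity() {
  return [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];
}

function translation(x, y, z) {
  return [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  x, y, z, 1];
}

// Returns a * b, so b is applied to a vertex first.
function multiply(a, b) {
  const out = new Array(16).fill(0);
  for (let col = 0; col < 4; col++)
    for (let row = 0; row < 4; row++)
      for (let k = 0; k < 4; k++)
        out[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
  return out;
}

class Node {
  constructor(parent = null) {
    this.parent = parent;
    this.children = [];
    this.localMatrix = identity(); // transform relative to the parent
    this.worldMatrix = identity(); // final "model" matrix
    if (parent) parent.children.push(this);
  }

  // Recompute world matrices top-down, so parents are updated before children.
  updateWorldMatrix() {
    this.worldMatrix = this.parent
      ? multiply(this.parent.worldMatrix, this.localMatrix)
      : this.localMatrix.slice();
    for (const child of this.children) child.updateWorldMatrix();
  }
}
```

Calling `updateWorldMatrix()` on the root once per frame, after editing any `localMatrix`, refreshes the whole hierarchy.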

Knowing that, and since you only need three objects and two hierarchical transformations, you don’t have to implement a whole scene graph; you only need to simulate this principle by multiplying the proper matrices to reproduce the desired behavior.
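For example, the three-object case can be sketched with a handful of matrix products. Everything here is an assumption made for illustration: the function names, the column-major, column-vector convention, an orbit restricted to the Z axis, and `sphereB` resting on top of `sphereA` along the locator's +Y axis. The `rollingSpinAngle` helper is an addition beyond the explanation above: for a single fixed orbit axis, a spin of `orbitAngle * radiusA / radiusB` relative to the `Locator` is what keeps the contact point from slipping.

```javascript
// 4x4 matrices as flat column-major arrays, used with column vectors.
function multiply(a, b) {
  const out = new Array(16).fill(0);
  for (let col = 0; col < 4; col++)
    for (let row = 0; row < 4; row++)
      for (let k = 0; k < 4; k++)
        out[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
  return out;
}

function translation(x, y, z) {
  return [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  x, y, z, 1];
}

function rotationZ(angle) {
  const c = Math.cos(angle), s = Math.sin(angle);
  return [c, s, 0, 0,  -s, c, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];
}

// Assumed no-slip relation for a single fixed orbit axis (see lead-in).
function rollingSpinAngle(radiusA, radiusB, orbitAngle) {
  return orbitAngle * radiusA / radiusB;
}

// Model matrix of sphereB, built as Locator world matrix * sphereB local matrix.
function sphereBWorldMatrix(centerA, radiusA, radiusB, orbitAngle, spinAngle) {
  // Locator: sits at the center of sphereA and carries the orbit rotation.
  const locatorWorld = multiply(translation(...centerA), rotationZ(orbitAngle));
  // sphereB local: offset so the two spheres touch, then spin about its own center.
  const sphereBLocal = multiply(translation(0, radiusA + radiusB, 0),
                                rotationZ(spinAngle));
  return multiply(locatorWorld, sphereBLocal);
}
```

The result is passed as the model matrix of `sphereB`; `sphereA` keeps a plain `translation(...centerA)`. For free dragging about changing axes, the orbit and spin would have to be accumulated incrementally rather than derived from a single angle.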