Is there any way to use this to generate multiple images of the same model? E.g. a car model rotated around, where every image shows the same generated car.
Yes: input image => embedding => N images. If you're thinking of 3D perspectives for rendering, you'd apply ControlNet to the N.
ref.: "The model can also understand image embeddings, which makes it possible to generate variations of a given image (left). There was no prompt given here."
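To make the image => embedding => N images flow concrete, here's a rough, model-agnostic sketch. `encode` and `sample` are hypothetical callables I made up for illustration, not any real library's API, and the toy versions below just stand in for an image encoder and a sampler. The point is only the data flow: the embedding is computed once, and each of the N outputs is a separate sample with its own seed.

```python
import random


def generate_variations(image, encode, sample, n, base_seed=0):
    """Encode the image once, then draw n samples conditioned on that embedding."""
    embedding = encode(image)  # image => embedding (computed a single time)
    # embedding => N images: one sample per seed, all sharing the same conditioning
    return [sample(embedding, base_seed + i) for i in range(n)]


# --- Toy stand-ins (assumptions, purely illustrative) ---
def toy_encode(image):
    # Pretend "embedding": the mean of each row of pixel values.
    return [sum(row) / len(row) for row in image]


def toy_sample(embedding, seed):
    # Pretend "generation": the embedding plus seed-dependent noise.
    rng = random.Random(seed)
    return [round(e + rng.gauss(0, 0.1), 4) for e in embedding]


image = [[0.1, 0.3], [0.5, 0.7]]
variations = generate_variations(image, toy_encode, toy_sample, n=4)
for v in variations:
    print(v)
```

Re-running with the same `base_seed` reproduces the same N outputs; changing the seed changes every output while the conditioning embedding stays the same.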
The model looks different in each of those variations, though. That seems to be intentional, but the post you're responding to is asking whether it's possible to keep the model exactly the same in each render, varying only the perspective.