google street view – What is the difference between initial and pose heading/pitch/roll?

I’m trying to understand the Photo Sphere XMP Metadata specification, which defines “initial” and “pose” variants of the heading/pitch/roll orientation. The Euler overview specifies that these are in a local frame relative to Earth’s north, so it seems like a single triplet would be sufficient.

Are there two because some cameras might power on with some initial calibrated orientation, then encode changes from the initial orientation in the pose field? In that case, would you need to concatenate the orientations to get the final orientation?
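To make the "concatenate" idea concrete, here is a minimal sketch of what I mean by composing two heading/pitch/roll triplets. The rotation order (heading about Z, then pitch about Y, then roll about X), the sign conventions, and all function names are my own assumptions for illustration; I have not confirmed that the spec uses this convention.

```python
import math

def rot_x(deg):
    """Rotation matrix about the X axis (assumed roll axis)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(deg):
    """Rotation matrix about the Y axis (assumed pitch axis)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(deg):
    """Rotation matrix about the Z axis (assumed heading axis)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def hpr_to_matrix(heading, pitch, roll):
    # Assumed convention: R = Rz(heading) * Ry(pitch) * Rx(roll).
    return matmul(matmul(rot_z(heading), rot_y(pitch)), rot_x(roll))

def compose(first, second):
    """Apply `second` in the frame produced by `first`.

    Both arguments are (heading, pitch, roll) tuples in degrees.
    The result is a rotation matrix, not a new triplet, because the
    angles of two rotations do not simply add in general.
    """
    return matmul(hpr_to_matrix(*first), hpr_to_matrix(*second))
```

Under these assumptions, `compose((90, 0, 0), (90, 0, 0))` matches `hpr_to_matrix(180, 0, 0)`, but once pitch or roll is involved the order of the two triplets matters, so the question of which one comes first is not just bookkeeping.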

Or is there a semantic difference between the two? For example, initial may be the camera orientation, but we wish to present the panorama with a different orientation, say because we want the viewer to pan around an orientation different from the camera’s when the camera is mounted sideways.

For context, my GoPro Max does not record any orientation quantities, so I am correcting the orientation manually. I would like to record its orientation both with respect to the local frame and relative to the vehicle trajectory.
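For the heading component, this is roughly how I am deriving the "relative to the vehicle trajectory" value. It is only a sketch: it assumes the direction of travel can be approximated by the great-circle bearing between two consecutive GPS fixes, that headings are in degrees clockwise from north, and the function names are my own.

```python
import math

def course_between(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from fix 1 to fix 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def relative_heading(camera_heading, course):
    """Camera heading minus vehicle course, wrapped to [-180, 180)."""
    return (camera_heading - course + 180.0) % 360.0 - 180.0
```

For example, a camera heading of 10° on a vehicle travelling at a course of 350° gives a relative heading of 20°, and swapping the two gives -20°.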