You could use only one camera and have some other objects parented to the players as location trackers, then lerp the camera's position between those two trackers.
This would effectively make the camera its own entity in the scene: not parented to either player and free to move on its own.
Or you could create a third camera just for the transition and switch to it in between the other two.
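
For the single-camera option, here is a rough, engine-agnostic sketch of the lerp. `Vec3`, `lerp`, and `camera_position` are names I made up for illustration; your engine almost certainly has its own vector type and lerp function, so treat this as the idea rather than the exact code.

```python
from dataclasses import dataclass


@dataclass
class Vec3:
    """Hypothetical stand-in for an engine's vector type."""
    x: float
    y: float
    z: float


def lerp(a: Vec3, b: Vec3, t: float) -> Vec3:
    """Linearly interpolate from a to b, with t clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return Vec3(
        a.x + (b.x - a.x) * t,
        a.y + (b.y - a.y) * t,
        a.z + (b.z - a.z) * t,
    )


def camera_position(tracker_a: Vec3, tracker_b: Vec3,
                    elapsed: float, duration: float) -> Vec3:
    """Camera position `elapsed` seconds into a `duration`-second transition.

    tracker_a and tracker_b are the current world positions of the
    tracker objects parented to each player.
    """
    t = elapsed / duration if duration > 0 else 1.0
    return lerp(tracker_a, tracker_b, t)
```

Call `camera_position` every frame with the trackers' current positions, so the camera still ends up in the right place if the players move during the transition.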