Shadowphile Well, if you already have a working system that's good enough, go with that. Things don't need to be perfect. Just accept its limitations and adapt to the constraints. If the constraints are not acceptable, you'll need to re-engineer one way or another. As always, precisely identify the bottlenecks and work to minimize their impact.
Be willing to compromise. As they say - perfect is the enemy of done 🙂
If you're burned out on the technical side, I'd just lower the resolution of all images to a quarter and finish a playable demo, focusing on gameplay, visuals and story. Even the worst kind of brute force may be good enough here. Start thinking about optimizations and system redesign only if/when the game as a whole becomes interesting enough to an impartial audience.
That said, basing the system on viewports + background loading should perform well for this type of game in most situations, especially if you don't unload locations that are nearby but not directly adjacent. The pace of play in games like this is rather slow. If you cache enough locations you should be able to get away without displaying any loaders.
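To make the caching idea concrete, here's a rough sketch in Python (not engine code, just the bookkeeping). It assumes your locations form a graph of "you can walk from here to there" links, and it keeps everything within a couple of moves of the player loaded. The names `locations_within`, `LocationCache` and `keep_hops` are made up for illustration, and the actual loading/unloading is left out: `move_to` only tells you what to queue for background loading and what is safe to free.

```python
from collections import deque


def locations_within(graph, start, max_hops):
    """Return the set of locations reachable from `start` in at most `max_hops` moves."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        here = queue.popleft()
        if hops[here] == max_hops:
            continue  # do not expand past the keep radius
        for neighbor in graph.get(here, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[here] + 1
                queue.append(neighbor)
    return set(hops)


class LocationCache:
    """Tracks which locations should stay loaded around the player (hypothetical helper)."""

    def __init__(self, graph, keep_hops=2):
        self.graph = graph          # location -> list of directly adjacent locations
        self.keep_hops = keep_hops  # 1 = adjacent only, 2 = also "nearby", and so on
        self.loaded = set()

    def move_to(self, location):
        """Return (to_load, to_unload) for the player's new location."""
        wanted = locations_within(self.graph, location, self.keep_hops)
        to_load = sorted(wanted - self.loaded)
        to_unload = sorted(self.loaded - wanted)
        self.loaded = wanted
        return to_load, to_unload


if __name__ == "__main__":
    # A corridor of five locations: a - b - c - d - e
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
    cache = LocationCache(graph, keep_hops=2)
    for stop in ["a", "c", "e"]:
        print(stop, cache.move_to(stop))
```

Raising `keep_hops` trades memory for fewer cache misses, so it's the one knob to tune once you know how much memory a location costs you.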
Even with location caching, there can always be unexpected situations where a location is not ready, for example if the player moves quickly through a sequence of locations to one that hasn't been seen yet. In that case be ready to display a loader disguised as a time-stretchable transition animation.
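By "time-stretchable" I mean something like the following sketch, again in plain Python rather than engine code. The transition fades out, then holds on whatever the middle of your animation is (a black screen, a looping walk cycle, a map) until the location reports ready, and only then fades in. The class name, the phase names and the default durations are all placeholders I made up; `location_ready` is whatever flag your loader gives you.

```python
class StretchableTransition:
    """Fade out, hold until the location is ready, then fade in (illustrative only)."""

    def __init__(self, fade_out=0.5, fade_in=0.5):
        self.fade_out = fade_out  # seconds
        self.fade_in = fade_in    # seconds
        self.elapsed = 0.0
        self.phase = "fade_out"

    def update(self, dt, location_ready):
        """Advance by `dt` seconds and return the current phase."""
        self.elapsed += dt
        if self.phase == "fade_out" and self.elapsed >= self.fade_out:
            self.phase, self.elapsed = "hold", 0.0
        if self.phase == "hold" and location_ready:
            # The hold lasts zero time when the location was already cached.
            self.phase, self.elapsed = "fade_in", 0.0
        elif self.phase == "fade_in" and self.elapsed >= self.fade_in:
            self.phase = "done"
        return self.phase


if __name__ == "__main__":
    transition = StretchableTransition()
    # Pretend the location only becomes ready on the fourth frame.
    for ready in [False, False, False, True, True, True]:
        print(transition.update(0.25, ready))
```

When the location is already cached the hold phase is skipped entirely, so the player sees the same short transition every time and only an occasional slightly longer one.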
You can ask anything. If I know the answer I'll do my best to help you.