The question is whether attacking should be possible independently of movement. If not (e.g. in a fighting game), I would add a displacement component to the Attack state and maybe rename Move to MovementInput. When the player enters the Attack state, that state takes control of all movement until it finishes. The Attack-Move is then just a normal Attack with a modified movement control scheme.
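To make the first option concrete, here is a minimal sketch in Python of an Attack state that owns displacement while it is active. The `Player` class, the tick-based `update` method, and all the tuning constants are assumptions made up for the example, not part of any particular engine.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    MOVE = auto()
    ATTACK = auto()


@dataclass
class Player:
    x: float = 0.0
    state: State = State.MOVE
    attack_ticks_left: int = 0

    # Hypothetical tuning values, chosen only for illustration.
    WALK_SPEED = 1.0       # displacement per tick at full input in MOVE
    ATTACK_LUNGE = 0.5     # fixed forward displacement per tick in ATTACK
    ATTACK_STEER = 0.25    # how much MovementInput still counts in ATTACK
    ATTACK_DURATION = 3    # ticks until the attack finishes

    def update(self, movement_input: float, attack_pressed: bool) -> None:
        # Entering ATTACK hands movement control over to the attack.
        if self.state is State.MOVE and attack_pressed:
            self.state = State.ATTACK
            self.attack_ticks_left = self.ATTACK_DURATION

        if self.state is State.MOVE:
            self.x += movement_input * self.WALK_SPEED
        else:
            # The attack computes the displacement itself; the raw input
            # is only an ingredient (the "modified control scheme").
            self.x += self.ATTACK_LUNGE + movement_input * self.ATTACK_STEER
            self.attack_ticks_left -= 1
            if self.attack_ticks_left == 0:
                self.state = State.MOVE
```

Setting `ATTACK_STEER` to 0 would give a plain Attack that ignores input entirely, so under these assumptions the Attack-Move needs no state of its own.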
If you want the two to be independent (e.g. in a twin-stick shooter), then I would define two separate state machines: (Hold, Walk, Run) and (Wait, Attack, Deflect). Both run in parallel, and they can call each other without modifying their own internal state. E.g. if I'm currently in Walk, I can press M2 to initiate Deflect, but this doesn't take me out of Walk. The meta-state machine would then consist of tuples: ((Hold, Wait), (Hold, Attack), (Hold, Deflect), ... , (Run, Attack), (Run, Deflect)).
That said, I would really try to avoid implementing the meta-state machine explicitly, unless you have a very complex control scheme.
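As a rough sketch of the second option in Python: two small machines are stored side by side, and the combined tuple is only derived when somebody asks for it. The `Character` class, the speed threshold, and the M1 binding for Attack are assumptions for the example; only M2 for Deflect comes from the description above.

```python
from enum import Enum, auto
from itertools import product
from typing import Optional, Tuple


class Movement(Enum):
    HOLD = auto()
    WALK = auto()
    RUN = auto()


class Action(Enum):
    WAIT = auto()
    ATTACK = auto()
    DEFLECT = auto()


class Character:
    def __init__(self) -> None:
        # Two independent machines, updated in parallel.
        self.movement = Movement.HOLD
        self.action = Action.WAIT

    def on_move_input(self, speed: float) -> None:
        # Hypothetical threshold: below 0.5 counts as walking.
        if speed == 0:
            self.movement = Movement.HOLD
        elif speed < 0.5:
            self.movement = Movement.WALK
        else:
            self.movement = Movement.RUN

    def on_action_input(self, button: Optional[str]) -> None:
        # Only the action machine changes; movement is left untouched.
        if button == "M1":      # assumed binding for Attack
            self.action = Action.ATTACK
        elif button == "M2":    # Deflect on M2, as in the text
            self.action = Action.DEFLECT
        else:
            self.action = Action.WAIT

    @property
    def meta_state(self) -> Tuple[Movement, Action]:
        # The meta-state is derived, never stored or transitioned directly.
        return (self.movement, self.action)


# The full meta-state space, listed only to show its size.
ALL_META_STATES = list(product(Movement, Action))
```

With three states in each machine there are already nine tuples in `ALL_META_STATES`, and every extra machine multiplies that number, which is why the tuple is better left implicit.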
Both approaches can scale if the states are well defined. The second one perhaps less so, because of the number of possible connections between a large number of FSMs, but that's a problem with every modular system. It really depends on the type of behaviour you want to model.