Basically what I want to know is how to remove or lessen voices that are close to the camera's mic, when I am really interested in keeping the audio of the subject I am focused/zoomed in on?
No. There's not much you can do... You can use (dynamic) compression to make the sounds "more equal" (quiet sounds louder and/or loud sounds quieter), but you cannot "invert" the loudness to make the loud sounds quieter than the quiet sounds, or vice versa. You can also isolate the left, right, or "center" audio, but that won't help you in this case.
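If it helps to see why, here is a rough Python sketch (NumPy only) of the two tools mentioned above. It is an illustration, not how any particular editor implements them: the function names, the -30 dB threshold, the 4:1 ratio, and the two voice levels are all made-up assumptions. The compressor is reduced to its static gain curve (no attack/release), and the "center" isolation is the usual mid/side sum-and-difference.

```python
import numpy as np

def compress_db(level_db, threshold_db=-30.0, ratio=4.0):
    """Static curve of a simple downward compressor (hypothetical settings).

    Levels above the threshold are scaled so that every `ratio` dB of
    input above the threshold comes out as 1 dB above the threshold.
    """
    level_db = np.asarray(level_db, dtype=float)
    over = np.maximum(level_db - threshold_db, 0.0)  # dB above threshold
    return level_db - over * (1.0 - 1.0 / ratio)

def mid_side(left, right):
    """Split stereo into 'center' (mid) and 'sides' (left minus right)."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    return mid, side

# Assumed levels: a parent next to the mic vs. the singer on stage.
near_db, far_db = -6.0, -24.0
out_near, out_far = compress_db([near_db, far_db])

print(f"gap before: {near_db - far_db:.1f} dB, after: {out_near - out_far:.1f} dB")
# The gap shrinks, but the near voice is still the louder one.
```

The curve never decreases as the input gets louder, so whatever was louder going in is still at least as loud coming out. And since a camera mic picks up both the nearby voice and the stage more or less equally in both channels, both end up in the "mid" signal together, which is why the center trick doesn't separate them.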
Maybe another parent was recording from a different location, and you can borrow some footage?
Pros use multi-track recording to record all the voices & instruments separately. That way they can adjust each voice/instrument separately, replace any given track, or record the different parts at different times. But once it's mixed down to stereo (or surround),
"You can't un-fry an egg, you can't un-bake a cake, and you can't un-mix an audio track."
In this case, editing the clip or adding music and fading out the clip audio won't cut it.
Mute and add subtitles? Re-record the singing? That's what the pros do...

Almost all on-location dialog is re-recorded in the studio... (Add some room noise & reverb for realism.)
...that she is holding the point-n-shoot camera upside down.
That can probably be fixed, except the part where the camera is rotating back to the correct position.
P.S. Next time, wear this shirt!
