mocha Pro would use the background track to try to replace the selected foreground object with clean areas of the background taken from other frames in time. It also employs some intelligence to match the lighting, etc. This can be very successful if the background is non-organic and there is enough parallax from camera movement. In cases where the foreground passes in front of multiple "planes", each bg plane needs to be tracked on its own layer.
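To make the "clean areas from other frames in time" idea concrete, here is a rough NumPy sketch. It is not mocha's actual algorithm, just an illustration under some assumptions: the frames are already aligned to a single background plane (which is what the planar track is for), the foreground mask is given per frame, and the function name `temporal_fill` is made up for this example. There is no lighting matching here.

```python
import numpy as np

def temporal_fill(frames, masks, target):
    """Fill covered pixels of frames[target] from the nearest frame in time
    where the same pixel is not covered.

    frames: (T, H, W) array, assumed already aligned to the background plane.
    masks:  (T, H, W) bool array, True where the foreground covers the background.
    Returns the filled frame and a mask of pixels that could not be filled.
    """
    frames = np.asarray(frames, dtype=float)
    masks = np.asarray(masks, dtype=bool)
    out = frames[target].copy()
    todo = masks[target].copy()  # pixels that still need clean background

    # Visit the other frames from nearest to farthest in time.
    others = [t for t in range(frames.shape[0]) if t != target]
    for t in sorted(others, key=lambda t: abs(t - target)):
        usable = todo & ~masks[t]  # needed here, and clean in frame t
        out[usable] = frames[t][usable]
        todo &= ~usable
        if not todo.any():
            break
    return out, todo

# Toy shot: background [10, 20, 30, 40], a one-pixel object (value 99) moving right.
background = np.array([[10.0, 20.0, 30.0, 40.0]])
frames = np.stack([background.copy() for _ in range(3)])
masks = np.zeros(frames.shape, dtype=bool)
for t in range(3):
    frames[t, 0, t] = 99.0
    masks[t, 0, t] = True

filled, unresolved = temporal_fill(frames, masks, target=1)
```

Any pixel still flagged in `unresolved` is one the foreground never uncovers, and that is exactly the case where a painted clean frame is needed.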
When a clean frame does not exist anywhere in the shot, the artist can feed the module clean, painted frames. These clean frames can be interpolated into the shot and even mixed with the auto-removed frames.
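As a rough picture of what interpolating clean frames can mean, here is a small sketch that cross-dissolves between two painted plates by frame number. The linear blend and the function name `interpolate_clean_plates` are my own assumptions for illustration; I am not claiming this is how the module does it internally.

```python
import numpy as np

def interpolate_clean_plates(plate_a, frame_a, plate_b, frame_b, frame):
    """Linearly blend two painted clean plates for an in-between frame.

    plate_a / plate_b: arrays of the same shape, painted at frame_a / frame_b.
    Frames outside [frame_a, frame_b] simply get the nearest plate.
    """
    if frame_a == frame_b:
        raise ValueError("the two plates must sit on different frames")
    w = (frame - frame_a) / (frame_b - frame_a)
    w = min(max(w, 0.0), 1.0)  # clamp so we never extrapolate
    return (1.0 - w) * np.asarray(plate_a, dtype=float) + w * np.asarray(plate_b, dtype=float)

# Two tiny "plates" painted at frames 10 and 20.
plate_10 = np.zeros((2, 2))
plate_20 = np.full((2, 2), 10.0)
halfway = interpolate_clean_plates(plate_10, 10, plate_20, 20, frame=15)
```

The same kind of weighting could be used to mix a painted plate with an auto-removed frame, with the weight chosen by the artist instead of by frame number.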
There is a bit of a learning curve, so I would suggest watching Mary Poplin's Remove Module tutorials that cover clean plates.