The animal would have to be able to see and hear its handler. It is able to understand simple commands. Ergo:
1/ A single handler per animal should be nominated. Lose the handler, lose control of the animal. That's the cost.
2/ The handler has to be visible to the animal.
3/ Erudite wit works up to 12". The handler has to be within 12" for the animal to follow orders.
4/ If the handler is out of sight or over 12" away, control of the animal is lost.
5/ Simple commands are those a dog would understand: Go, Fetch, Return, Attack, Stay. I imagine that "Fetch" is the one at the root of the problem. So make the objective something which either cannot be recognised by a simple animal (Secret Papers amongst other Papers) or is too big to be picked up in the animal's mouth.
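
For anyone who finds it easier to read rules as logic, here is a rough Python sketch of how points 1 to 5 might be checked at the table. It is only an illustration of the suggestion above, not an official rule: the names (Objective, handler_has_control, can_obey) are made up for the example, and it assumes "within 12"" includes exactly 12" and that a Fetch order always needs a named objective.

```python
from dataclasses import dataclass
from typing import Optional

# Range taken from point 3: the handler must be within 12" of the animal.
COMMAND_RANGE_INCHES = 12.0

# Point 5: the only orders a simple animal understands.
SIMPLE_COMMANDS = {"Go", "Fetch", "Return", "Attack", "Stay"}


@dataclass
class Objective:
    """Hypothetical description of a scenario objective."""
    recognisable_by_animal: bool  # False for Secret Papers amongst other Papers
    fits_in_mouth: bool           # False if too big to be picked up


def handler_has_control(handler_alive: bool,
                        handler_visible: bool,
                        distance_inches: float) -> bool:
    """Points 1 to 4: control needs a living, visible handler within range."""
    return (handler_alive
            and handler_visible
            and distance_inches <= COMMAND_RANGE_INCHES)


def can_obey(command: str,
             handler_alive: bool,
             handler_visible: bool,
             distance_inches: float,
             objective: Optional[Objective] = None) -> bool:
    """Return True if the animal can carry out the given order."""
    if command not in SIMPLE_COMMANDS:
        return False  # too complicated for the animal
    if not handler_has_control(handler_alive, handler_visible, distance_inches):
        return False  # control of the animal is lost
    if command == "Fetch":
        # Assumption: Fetch only works on something the animal can
        # both recognise and carry.
        if objective is None:
            return False
        return objective.recognisable_by_animal and objective.fits_in_mouth
    return True
```

With this sketch, an animal 10" from a visible handler can be told to Attack, but cannot Fetch an objective defined as Objective(recognisable_by_animal=False, fits_in_mouth=True), which is the fix proposed in point 5.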