The second method considers only actions whose estimated Q-value (Q-Estimated) is positive. Any action whose estimate is negative is discarded, regardless of the risk involved.
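The rule described above, keeping only actions with a positive Q estimate, can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names, the dictionary representation of Q estimates, the strict `> 0` threshold, and the choice to pick the highest-valued remaining action are all assumptions made for the example.

```python
from typing import Dict, Optional


def positive_actions(q_estimates: Dict[str, float]) -> Dict[str, float]:
    """Keep only actions whose estimated Q-value is strictly positive."""
    # Assumption: an estimate of exactly 0 does not count as positive.
    return {action: q for action, q in q_estimates.items() if q > 0}


def select_action(q_estimates: Dict[str, float]) -> Optional[str]:
    """Pick the best action among those with a positive Q estimate.

    Returns None when every estimate is zero or negative, i.e. when
    the filter leaves no action to consider.
    """
    candidates = positive_actions(q_estimates)
    if not candidates:
        return None
    # Assumption: among the surviving actions, take the highest estimate.
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    estimates = {"left": 0.5, "right": 1.2, "wait": -3.0}
    print(positive_actions(estimates))
    print(select_action(estimates))
```

Returning `None` when nothing passes the filter leaves the fallback decision (for example, taking no action at all) to the caller, since the text does not say what should happen in that case.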