We are working on polishing the code to make it available as soon as possible.
In image recognition, both foreground (FG) and background (BG) play an important role; however, standard deep image recognition often leads to unintended over-reliance on the BG, limiting model robustness in real-world deployment settings. Current solutions mainly suppress the BG, sacrificing BG information for improved generalization.
We propose "Segment to Recognize Robustly" (S2R2), a novel recognition approach that decouples FG and BG modelling and combines them in a simple, robust, and interpretable manner. S2R2 leverages recent advances in zero-shot segmentation to isolate the FG and the BG before or during recognition. By combining FG and BG, optionally together with a standard full-image classifier, S2R2 achieves state-of-the-art results on in-domain data while maintaining robustness to BG shifts.
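
The decoupling idea can be illustrated with a minimal sketch. It is not the S2R2 implementation: the abstract does not specify the fusion rule, so the weighted average of per-stream class probabilities below is an assumption, as are the function names `split_by_mask` and `fuse`, the weights, and the use of a precomputed binary mask in place of a zero-shot segmentation model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_by_mask(image, mask, fill=0.0):
    """Split a 2D image into FG and BG views using a binary mask (1 = FG).

    Hypothetical helper: in practice the mask would come from a
    segmentation model; here it is simply given.
    """
    fg = [[p if m else fill for p, m in zip(row, mrow)]
          for row, mrow in zip(image, mask)]
    bg = [[fill if m else p for p, m in zip(row, mrow)]
          for row, mrow in zip(image, mask)]
    return fg, bg


def fuse(fg_logits, bg_logits, full_logits=None,
         w_fg=0.6, w_bg=0.2, w_full=0.2):
    """Combine per-stream predictions by a weighted average of probabilities.

    Assumed fusion rule and weights, chosen for illustration only.
    The full-image stream is optional.
    """
    streams = [(w_fg, softmax(fg_logits)), (w_bg, softmax(bg_logits))]
    if full_logits is not None:
        streams.append((w_full, softmax(full_logits)))
    total_w = sum(w for w, _ in streams)
    n_classes = len(fg_logits)
    return [sum(w * p[i] for w, p in streams) / total_w
            for i in range(n_classes)]


# Toy usage: a 2x2 "image", a diagonal FG mask, and made-up logits
# standing in for the outputs of separate FG and BG classifiers.
fg_view, bg_view = split_by_mask([[1, 2], [3, 4]], [[1, 0], [0, 1]])
probs = fuse(fg_logits=[2.0, 0.0], bg_logits=[0.0, 1.0])
```

Because each stream contributes an explicit, weighted term, the influence of the BG on the final prediction can be inspected or reduced directly, which is one way to read the "interpretable" combination described above.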