@article{wang2023repositioning,
title={Repositioning the Subject within Image},
author={Wang, Yikai and Cao, Chenjie and Dong, Qiaole and Li, Yifan and Fu, Yanwei},
year={2023},
journal={arXiv preprint arXiv:2401.16861},
}
SEELE uses an interactive pipeline of pre-processing, manipulation, and post-processing for subject repositioning. In the pre-processing phase, SEELE identifies the subject with a segmentation model guided by user-provided conditions, and keeps the occlusion relationships between subjects intact. In the manipulation stage, SEELE manipulates the image and fills in any gaps left behind. SEELE also rectifies obscured subjects using user-specified incomplete masks. In the post-processing phase, SEELE addresses any disparities between the repositioned subject and its new surroundings.
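To make the "gaps left behind" in the manipulation stage concrete, the following is a minimal sketch of the mask bookkeeping only, and it is not SEELE's implementation. It assumes the subject is given as a binary mask and is moved by a pure translation. The names `shift_mask` and `reposition_masks` are hypothetical, and the generative filling itself is not shown.

```python
from typing import List, Tuple

Mask = List[List[bool]]  # row-major binary mask, True = subject pixel


def shift_mask(mask: Mask, dx: int, dy: int) -> Mask:
    """Translate a binary mask by (dx, dy); pixels moved off-canvas are dropped."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    out[ny][nx] = True
    return out


def reposition_masks(mask: Mask, dx: int, dy: int) -> Tuple[Mask, Mask]:
    """Return (moved, hole) for a translated subject.

    `moved` is where the subject lands. `hole` marks pixels the subject
    used to cover and no longer does, i.e. the gap that a filling step
    (assumed here, not implemented) would need to synthesize.
    """
    h, w = len(mask), len(mask[0])
    moved = shift_mask(mask, dx, dy)
    hole = [[mask[y][x] and not moved[y][x] for x in range(w)] for y in range(h)]
    return moved, hole


# A 2x2 subject on a 4x4 canvas, moved one pixel to the right.
subject = [
    [False, False, False, False],
    [True,  True,  False, False],
    [True,  True,  False, False],
    [False, False, False, False],
]
moved, hole = reposition_masks(subject, dx=1, dy=0)
```

In this toy case the hole is the single column the subject vacated. With a larger shift, the hole would cover the subject's entire original footprint.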
We curated a benchmark dataset called ReS. It contains 100 image pairs at 4032×3024 resolution; in each pair, one image shows the subject repositioned while all other elements remain unchanged. The images were collected from over 20 indoor and outdoor scenes and show subjects from more than 50 categories. This diversity enables effective simulation of real-world open-vocabulary applications. The dataset is available here.
We present results of SEELE on 1024×1024 images.
We also assess the effectiveness of various components within SEELE during both pre-processing and post-processing phases.