Image Generation from Scene Graphs #100

wrryu09 · 2023-01-22T14:08:59Z

Justin Johnson, Agrim Gupta, Li Fei-Fei; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 1219-1228

🖍️ Goal: build an end-to-end method that generates complex images with many objects and relationships from a scene graph

Unlike existing methods that generate images from text descriptions, generating from a structured scene graph lets the model reason explicitly about objects and relationships ⇒ it can generate complex images with many recognizable objects

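The model processes the input graph with graph convolution, computes a scene layout by predicting a bounding box and segmentation mask for each object, and converts the layout to an image with a cascaded refinement network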

scene graph

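Information conveyed by a complex sentence can be expressed more explicitly as a scene graph of objects and their relationships. Scene graphs have been used for semantic image retrieval and for evaluating and improving image captioning, which makes them a powerful structured representation for both images and language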
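
As a concrete picture of this representation, a scene graph can be held as a list of object categories plus (subject, predicate, object) triples. The sketch below is only an illustration: the `SceneGraph` class, its method names, and the sheep/grass/sky example are assumptions made here, not taken from the paper's code.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Objects are category labels; edges are (subject_idx, predicate, object_idx)."""
    objects: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_object(self, category):
        """Append an object and return its index."""
        self.objects.append(category)
        return len(self.objects) - 1

    def add_edge(self, subj, predicate, obj):
        """Add a directed relationship between two existing objects."""
        n = len(self.objects)
        if not (0 <= subj < n and 0 <= obj < n):
            raise IndexError("edge endpoints must refer to existing objects")
        self.edges.append((subj, predicate, obj))

    def describe(self):
        """Render every edge as a short phrase."""
        return [f"{self.objects[s]} {p} {self.objects[o]}" for s, p, o in self.edges]


# Hypothetical example graph
g = SceneGraph()
sheep = g.add_object("sheep")
grass = g.add_object("grass")
sky = g.add_object("sky")
g.add_edge(sheep, "standing on", grass)
g.add_edge(sky, "above", grass)
print(g.describe())
```

Storing edges as index triples keeps two objects of the same category (for example two sheep) distinct, which a plain sentence cannot easily do.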

To process the scene graph input, use a graph convolution network, which passes information along graph edges
After processing the graph, the model must bridge the gap between the symbolic graph-structured input and the two-dimensional image output. To this end, it predicts a bounding box and segmentation mask for every object in the graph and builds a scene layout from them
To generate an image that respects the predicted layout, use a cascaded refinement network (CRN), which processes the layout at increasing spatial scales
Finally, the generated image must be realistic and contain recognizable objects → train adversarially against a pair of discriminator networks operating on image patches and generated objects
Experiments on the Visual Genome and COCO-Stuff datasets give qualitative results showing the ability to generate complex images, along with comprehensive ablations that validate each component of the model

primary challenge

must develop a method for processing the graph-structured input
must ensure that the generated images respect the objects and relationships specified by the graph
must ensure that the synthesized images are realistic

▫️

The scene graph → image conversion is done by an image generation network f, which takes a scene graph G and noise z as input and generates an image I

G is processed by a graph convolution network that produces an embedding vector for each object; each graph convolution layer mixes information along the edges of the graph
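
A minimal sketch of one such layer, to show what "mixing information along edges" can look like. Everything specific here is an assumption for illustration: a single linear map with ReLU stands in for the learned network, candidate vectors are pooled by averaging, the weights `W` are random and untrained, and plain Python lists are used instead of a tensor library.

```python
import random


def linear_relu(x, W):
    """Compute relu(x @ W) for a vector x (length d_in) and W given as d_in rows."""
    return [max(0.0, sum(xi * wi for xi, wi in zip(x, col))) for col in zip(*W)]


def graph_conv_layer(obj_vecs, pred_vecs, edges, W):
    """One simplified graph-convolution step.

    obj_vecs:  N object vectors, each of length d
    pred_vecs: E predicate vectors, each of length d
    edges:     E (subject_index, object_index) pairs
    W:         (3*d) x (3*d_out) weight matrix (hypothetical, untrained)
    """
    d_out = len(W[0]) // 3
    candidates = [[] for _ in obj_vecs]  # candidate vectors gathered per object
    new_preds = []
    for (s, o), p in zip(edges, pred_vecs):
        # List concatenation builds the (subject, predicate, object) input
        out = linear_relu(obj_vecs[s] + p + obj_vecs[o], W)
        candidates[s].append(out[:d_out])
        new_preds.append(out[d_out:2 * d_out])
        candidates[o].append(out[2 * d_out:])
    new_objs = []
    for cands in candidates:
        if cands:
            # Average all candidates so the result is independent of edge order
            new_objs.append([sum(col) / len(cands) for col in zip(*cands)])
        else:
            new_objs.append([0.0] * d_out)  # isolated object: nothing to pool
    return new_objs, new_preds


random.seed(0)
d, d_out = 2, 3
W = [[random.uniform(-1, 1) for _ in range(3 * d_out)] for _ in range(3 * d)]
objs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
preds = [[0.5, 0.5]]
edges = [(0, 1)]  # object 2 takes part in no edge
new_objs, new_preds = graph_conv_layer(objs, preds, edges, W)
```

Stacking several such layers lets an object's vector depend on objects more than one edge away.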

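The object embedding vectors, which capture the objects and relationships in G, are used to predict a bounding box and segmentation mask for each object; these are combined to form the scene layout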
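
One way to picture the combination step, as a rough sketch: weight each object's embedding by its mask, place it inside its box, and sum over objects. The function name `compose_layout`, the normalized box format, and the nearest-neighbour mask lookup are assumptions made here to keep the example small.

```python
def compose_layout(embeddings, boxes, masks, H, W):
    """Sum per-object (mask * embedding) maps placed at each box.

    embeddings: N vectors of length D
    boxes:      N (x0, y0, x1, y1) tuples in normalized [0, 1] coordinates
    masks:      N square M x M grids with values in [0, 1]
    Returns an H x W grid of D-dimensional vectors.
    """
    D = len(embeddings[0])
    layout = [[[0.0] * D for _ in range(W)] for _ in range(H)]
    for emb, (x0, y0, x1, y1), mask in zip(embeddings, boxes, masks):
        M = len(mask)
        for y in range(H):
            for x in range(W):
                # Pixel centre in normalized coordinates
                cx, cy = (x + 0.5) / W, (y + 0.5) / H
                if not (x0 <= cx < x1 and y0 <= cy < y1):
                    continue
                # Nearest-neighbour lookup into the object's mask
                mx = min(M - 1, int((cx - x0) / (x1 - x0) * M))
                my = min(M - 1, int((cy - y0) / (y1 - y0) * M))
                m = mask[my][mx]
                for k in range(D):
                    layout[y][x][k] += m * emb[k]
    return layout


# Hypothetical two-object scene on a 4 x 4 layout
embeddings = [[1.0, 0.0], [0.0, 2.0]]
boxes = [(0.0, 0.0, 0.5, 0.5), (0.0, 0.0, 1.0, 1.0)]
masks = [[[1.0]], [[1.0, 1.0], [1.0, 0.0]]]
layout = compose_layout(embeddings, boxes, masks, H=4, W=4)
```

Where boxes overlap, the embeddings add up, so the layout keeps information about every object present at a pixel.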

▫️

The output image I is generated from the layout using a CRN

Each CRN module processes the layout at an increasing spatial scale, eventually producing the image I
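
The coarse-to-fine data flow can be sketched as follows. This shows only the wiring, under stated assumptions: each module sees the layout downsampled to its own resolution together with the previous module's output upsampled 2x, and `refine` is a placeholder (a plain average) for the learned convolutions. All function names are hypothetical, and the layout is a single-channel grid for brevity.

```python
def avg_pool(grid, size):
    """Downsample a square grid to size x size by block averaging."""
    k = len(grid) // size
    return [[sum(grid[y * k + dy][x * k + dx]
                 for dy in range(k) for dx in range(k)) / (k * k)
             for x in range(size)] for y in range(size)]


def upsample2x(grid):
    """Nearest-neighbour 2x upsampling."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def refine(layout_at_scale, prev_features):
    """Placeholder for a module's learned layers: average the two inputs."""
    return [[(a + b) / 2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(layout_at_scale, prev_features)]


def cascaded_refinement(layout, start=1):
    """Run modules at resolutions start, 2*start, ... up to the layout size."""
    n = len(layout)
    size = start
    features = [[0.0] * size for _ in range(size)]
    resolutions = []
    while True:
        if size > n:
            raise ValueError("layout side must equal start * 2**k")
        features = refine(avg_pool(layout, size), features)
        resolutions.append(size)
        if size == n:
            break
        features = upsample2x(features)  # hand off to the next, finer module
        size *= 2
    return features, resolutions


layout = [[1.0] * 4 for _ in range(4)]
image, resolutions = cascaded_refinement(layout)
print(resolutions)
```

Each module doubles the resolution of the one before it, so early modules settle global structure and later ones add detail.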

▫️

To make the image I look realistic and contain realistic, recognizable objects, f is trained adversarially against a pair of discriminator networks, D_img and D_obj ⇒ realistic images
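
Assuming the standard GAN objective (these notes do not state the exact loss), each discriminator D in {D_img, D_obj} would try to maximize the quantity below, while f tries to minimize it. The symbols p_real and p_fake are notation introduced here for the distributions of real and generated inputs: image patches for D_img, object crops for D_obj.

```latex
\mathcal{L}_{\mathrm{GAN}}
  = \mathbb{E}_{x \sim p_{\mathrm{real}}} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{x \sim p_{\mathrm{fake}}} \bigl[ \log\bigl(1 - D(x)\bigr) \bigr]
```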

wrryu09 added CV Seungyeon 2018 CVPR labels Jan 22, 2023
