SEAL: Semantic-aware Single-image Sticker Personalization with a Large-scale Sticker-tag Dataset
Chung-Ang University, NAVER Cloud, Lunit Inc.
*Co-corresponding Authors
TL;DR: A plug-and-play semantic adaptation module for single-image sticker personalization, designed to mitigate visual entanglement and structural rigidity in test-time fine-tuning pipelines.
CoRe-based Results
Input 1 (reference), Outputs 1-1 and 1-2. Prompt: S*, sitting at desk, hands on table, sticker style, green background
Input 2 (reference), Outputs 2-1 and 2-2. Prompt: S*, standing, close-up front view, animation style, cityscape background
Input 3 (reference), Outputs 3-1 and 3-2. Prompt: S*, front view, anime style, forest background
UnZipLoRA-based Results
Input 4 (reference), Outputs 4-1 and 4-2. Prompt: S*, leaning over bed, sticker style, room background
Input 5 (reference), Outputs 5-1 and 5-2. Prompt: S*, arms stretched, front view, sticker style, room background
Input 6 (reference), Outputs 6-1 and 6-2. Prompt: S*, holding canned food, close-up front view, sticker style, cityscape background
SEAL generated results overview

Synthesizing a target concept from a single reference image remains challenging in diffusion-based personalized text-to-image generation, particularly when prompts require explicit attribute edits. In the sticker domain, test-time fine-tuning methods often overfit to the reference image and suffer from visual entanglement and structural rigidity. We introduce SEAL, a plug-and-play, architecture-agnostic adaptation module that combines a Semantic-guided Spatial Attention Loss, a Split-merge Token Strategy, and a Structure-aware Layer Restriction. We also introduce StickerBench, a large-scale sticker dataset with six structured attributes for controlled evaluation of identity disentanglement and contextual controllability in single-image sticker personalization.

Method

SEAL is a plug-and-play module for existing personalization pipelines.

SEAL method overview

1. Semantic-guided Spatial Attention Loss

Aligns the concept-token cross-attention map with the object region predicted by SAM, suppressing background leakage and improving identity disentanglement.
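The page does not state the exact form of this loss, so the following is only a minimal sketch of one plausible alignment term: the fraction of the concept token's attention mass that falls outside the object mask. The function name `attention_leakage_loss`, the toy 2x2 map, and the normalization are assumptions made for illustration; in practice the mask would come from SAM and the attention map from the diffusion model.

```python
from typing import List

Grid = List[List[float]]


def attention_leakage_loss(attn: Grid, mask: Grid, eps: float = 1e-8) -> float:
    """Return the share of attention mass lying outside the object mask.

    attn: non-negative cross-attention map for the concept token.
    mask: binary object mask of the same shape (1 = object, 0 = background).
    Assumed formulation; the actual SEAL loss may differ.
    """
    if len(attn) != len(mask) or any(len(a) != len(m) for a, m in zip(attn, mask)):
        raise ValueError("attention map and mask must have the same shape")
    total = sum(sum(row) for row in attn)
    inside = sum(a * m for a_row, m_row in zip(attn, mask) for a, m in zip(a_row, m_row))
    return 1.0 - inside / (total + eps)


# Toy example: 20% of the attention sits on background pixels (top row).
attn = [[0.1, 0.1],
        [0.2, 0.6]]
mask = [[0, 0],
        [1, 1]]
loss = attention_leakage_loss(attn, mask)
aligned_loss = attention_leakage_loss([[0.0, 0.0], [0.5, 0.5]], mask)
```

Minimizing a term of this kind pushes attention toward the masked object region, which is the behavior the description above attributes to the loss.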

2. Split-merge Token Strategy

Optimizes multiple auxiliary embeddings and merges them into one concept embedding, improving optimization stability under single-image supervision.
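As an illustration of the merge step, the sketch below averages K auxiliary embeddings into a single concept embedding. The page does not specify the merge operator, so the element-wise mean and the helper name `merge_tokens` are assumptions, and the two-dimensional vectors are toy values.

```python
from typing import List, Sequence


def merge_tokens(aux: Sequence[Sequence[float]]) -> List[float]:
    """Merge K auxiliary embeddings into one vector by element-wise mean.

    The mean is an assumed merge operator chosen for illustration only.
    """
    if not aux:
        raise ValueError("at least one auxiliary embedding is required")
    dim = len(aux[0])
    if any(len(vec) != dim for vec in aux):
        raise ValueError("all auxiliary embeddings must share one dimension")
    k = len(aux)
    return [sum(vec[i] for vec in aux) / k for i in range(dim)]


# K = 3 auxiliary embeddings of dimension 2.
aux_embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
concept_embedding = merge_tokens(aux_embeddings)
```

Under a mean merge, each auxiliary embedding receives 1/K of the gradient that reaches the merged embedding, which is one way splitting could smooth updates when only a single image supervises training.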

3. Structure-aware Layer Restriction

Applies spatial supervision only to semantically informative cross-attention layers, reducing overfitting to low-level layout patterns.
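The restriction can be pictured as averaging the spatial loss over a chosen subset of cross-attention layers and ignoring the rest. In this sketch the layer names (`down.0`, `mid`, `up.1`), the per-layer loss values, and the choice of supervised layers are all hypothetical; the page does not say which layers SEAL treats as semantically informative.

```python
from typing import Dict, Iterable


def restricted_loss(per_layer_loss: Dict[str, float], supervised: Iterable[str]) -> float:
    """Average the spatial loss over the supervised layers only.

    Layers that are not listed in `supervised` contribute nothing.
    """
    chosen = [per_layer_loss[name] for name in supervised if name in per_layer_loss]
    if not chosen:
        return 0.0
    return sum(chosen) / len(chosen)


# Hypothetical per-layer spatial losses for three cross-attention layers.
per_layer = {"down.0": 0.9, "mid": 0.25, "up.1": 0.5}

# Assume only the middle and late layers are supervised.
loss_restricted = restricted_loss(per_layer, ["mid", "up.1"])
loss_all_layers = restricted_loss(per_layer, per_layer.keys())
```

Here the high loss on `down.0` no longer drives the update, so that layer is not pushed to reproduce the reference layout.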

Key Insights

Structural Rigidity and Background Entanglement

SEAL is designed to reduce the two dominant failure modes in single-image sticker personalization.

Structural Rigidity

Existing methods often memorize the reference-specific layout, which reduces flexibility under action, pose, and composition edits.

Comparison: reference image, baseline result, and baseline + SEAL result (ours).

Background Entanglement

Existing methods may absorb background cues into the learned concept representation, causing identity and context to become entangled.

Comparison: reference image, baseline result, and baseline + SEAL result (ours).

Results

Qualitative and quantitative results across representative personalization pipelines.

Qualitative comparison

Table 1. Qualitative comparison across baseline methods and SEAL-integrated variants.

Ablation study of SEAL

Figure 1. Visual ablation study of SEAL on StickerBench for single-image sticker personalization using CoRe.

Table 2. Ablation study of the proposed adaptation module on StickerBench for single-image sticker personalization using CoRe.

Figure 2. Visual analysis of structural rigidity with respect to Structure-aware Layer Restriction during embedding adaptation.

Figure 3. Detailed analysis of cross-attention maps at the end of embedding adaptation (250 steps) and at inference.

Figure 4. Inference-time visualization of cross-attention maps across different K values.

Table 3. Ablation study on the split token count K for the Split-merge Token Strategy.

Figure 5. Visual analysis of optimization stability with respect to the number of split tokens K.

Table 4. Ablation study on prompt representations for training and inference on StickerBench for single-image sticker personalization.

Dataset

StickerBench

StickerBench is a large-scale sticker dataset built for controlled evaluation of single-image sticker personalization. Each sample is annotated with structured tags under a six-attribute schema: Appearance, Emotion, Action, Camera Composition, Style, and Background.

This tag-based representation supports systematic prompt editing while keeping the target identity fixed, making it especially suitable for evaluating identity disentanglement and contextual controllability.
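To make the tag-based editing concrete, here is a small sketch that builds a prompt from structured tags and then changes one attribute while the identity token stays fixed. The attribute keys, the helper names `build_prompt` and `edit_tags`, and the attribute ordering are assumptions for illustration and are not the actual StickerBench interface; the `S*` token and the tag values follow the prompt style shown on this page.

```python
from typing import Dict

# Six-attribute schema named on this page; the key spelling and order are assumed.
ATTRIBUTE_ORDER = (
    "appearance", "emotion", "action",
    "camera_composition", "style", "background",
)


def build_prompt(tags: Dict[str, str], concept_token: str = "S*") -> str:
    """Join the concept token and the non-empty tags in schema order."""
    unknown = set(tags) - set(ATTRIBUTE_ORDER)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    parts = [concept_token] + [tags[a] for a in ATTRIBUTE_ORDER if tags.get(a)]
    return ", ".join(parts)


def edit_tags(tags: Dict[str, str], **changes: str) -> Dict[str, str]:
    """Return a copy of the tags with selected attributes replaced."""
    return {**tags, **changes}


base_tags = {
    "action": "standing",
    "camera_composition": "close-up front view",
    "style": "animation style",
    "background": "cityscape background",
}
base_prompt = build_prompt(base_tags)
edited_prompt = build_prompt(edit_tags(base_tags, background="forest background"))
```

Only the background tag differs between `base_prompt` and `edited_prompt`, so any change in the rendered identity can be attributed to entanglement rather than to the prompt.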

261K sticker images
Six structured attributes
Tag-based prompt interface
Built for controlled evaluation

BibTeX

@article{seal2026,
  title   = {SEAL: Semantic-aware Single-image Sticker Personalization with a Large-scale Sticker-tag Dataset},
  author  = {Author Names},
  journal = {Expert Systems with Applications},
  year    = {2026}
}