Seunghyeon Seo

I'm a Ph.D. candidate in the College of Engineering at Seoul National University, where I'm advised by Prof. Nojun Kwak in the Machine Intelligence and Pattern Analysis Lab (MIPAL). Previously, I received my bachelor's degree from the College of Agriculture and Life Sciences at SNU, where I majored in Agricultural Economics.

I'm interested in computer vision, deep learning, and neural rendering. My current focus is on efficient training frameworks for NeRF/3D-GS and on training with synthetic data generated by diffusion models.

I'm on the job market now, looking for research scientist/engineer roles where I can contribute, learn, and collaborate with great folks. Always happy to chat!

Email  /  CV  /  Google Scholar  /  LinkedIn  /  GitHub

Research
Generative Head-Mounted Camera Captures for Photorealistic Avatars
Shaojie Bai*, Seunghyeon Seo*, Yida Wang, Chenghui Li, Owen Wang, Te-Li Wang, Tianyang Ma, Jason Saragih, Shih-En Wei, Nojun Kwak, Hyung Jun Kim
Under Review
project page

We present GenHMC, a generative diffusion framework that synthesizes photorealistic head-mounted camera (HMC) images from avatar renderings. By enabling high-quality unpaired training data generation, GenHMC facilitates scalable training of facial encoders for Codec Avatars and generalizes well across diverse identities and expressions.

ROODI: Reconstructing Occluded Objects with Denoising Inpainters
Yeonjin Chang, Erqun Dong, Seunghyeon Seo, Nojun Kwak, Kwang Moo Yi
Under Review
project page / arXiv

We propose ROODI, a method for extracting and reconstructing 3D objects in the presence of occlusions using Gaussian Splatting. It first removes irrelevant splats with a KNN-based pruning strategy, then completes the occluded regions using a diffusion-based generative inpainting model, enabling high-quality geometry recovery even under heavy occlusion.

DivCon-NeRF: Generating Augmented Rays with Diversity and Consistency for Few-shot View Synthesis
Ingyun Lee, Jae Won Jang, Seunghyeon Seo, Nojun Kwak
Under Review
arXiv

We propose DivCon-NeRF, a novel ray augmentation method designed for few-shot novel view synthesis. By introducing surface-sphere and inner-sphere augmentation techniques, our method effectively balances ray diversity and geometric consistency, which helps suppress floaters and appearance artifacts often seen in sparse-input settings.

ARC-NeRF: Area Ray Casting for Broader Unseen View Coverage in Few-shot Object Rendering
Seunghyeon Seo, Yeonjin Chang, Jayeon Yoo, Seungwoo Lee, Hojun Lee, Nojun Kwak
CVPR 2025 Workshop on Computer Vision for Metaverse   (Oral)
project page / arXiv

We introduce ARC-NeRF, a few-shot rendering method that casts area rays to cover a broader set of unseen viewpoints, improving spatial generalization with minimal input. We also propose adaptive frequency regularization and a luminance consistency loss to further refine textures and high-frequency details in rendered outputs.

Unleash the Potential of CLIP for Video Highlight Detection
Donghoon Han*, Seunghyeon Seo*, Eunhwan Park, SeongUk Nam, Nojun Kwak
CVPR 2024 Workshop on Efficient Large Vision Models
arXiv

We introduce HL-CLIP, a CLIP-based video highlight detection framework that leverages the strong semantic alignment of pre-trained vision-language models. By fine-tuning the visual encoder and applying a saliency-based temporal pooling technique, our method achieves state-of-the-art performance with minimal domain-specific supervision.

Fast Sun-aligned Outdoor Scene Relighting based on TensoRF
Yeonjin Chang, Yearim Kim, Seunghyeon Seo, Jung Yi, Nojun Kwak
WACV 2024
arXiv

We present SR-TensoRF, a sun-aligned relighting approach for NeRF-style outdoor scenes that does not rely on environment maps. By aligning lighting with solar movement and using a cubemap-based TensoRF backbone, our method enables realistic and fast relighting for dynamic outdoor scenes with consistent directional light simulation.

ConcatPlexer: Additional Dim1 Batching for Faster ViTs
Donghoon Han, Seunghyeon Seo, DongHyeon Jeon, Jiho Jang, Chaerin Kong, Nojun Kwak
NeurIPS 2023 Workshop on Advancing Neural Network Training   (Oral)
arXiv

We propose ConcatPlexer, a simple yet effective batching strategy that accelerates Vision Transformer (ViT) inference by concatenating visual tokens along an additional dimension. This approach preserves model accuracy while improving inference throughput, requiring no architectural changes and offering easy integration into existing ViT pipelines.

FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis
Seunghyeon Seo, Yeonjin Chang, Nojun Kwak
ICCV 2023
project page / code / video / arXiv

We present FlipNeRF, a framework that utilizes flipped reflection rays derived from input images to simulate novel training views. This approach enhances surface normal estimation and rendering fidelity, enabling better generalization in few-shot novel view synthesis without requiring additional images or supervision.
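
As a rough illustration of the reflection idea, the sketch below mirrors a viewing direction about a unit surface normal using the standard formula r = d - 2(d·n)n. This is generic geometry in plain Python, not the FlipNeRF implementation; the function names and the example vectors are hypothetical and chosen only for illustration.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def reflect(direction, normal):
    """Mirror a direction about a surface normal: r = d - 2 (d . n) n.

    Both inputs are normalized first, so the result is also unit length.
    """
    d = normalize(direction)
    n = normalize(normal)
    dot = sum(a * b for a, b in zip(d, n))
    return tuple(a - 2.0 * dot * b for a, b in zip(d, n))


# Hypothetical example: a ray travelling down and to the right
# hits a surface whose normal points straight up.
incoming = (1.0, -1.0, 0.0)
surface_normal = (0.0, 1.0, 0.0)
reflected = reflect(incoming, surface_normal)
```

In this toy case the reflected direction keeps its horizontal component and flips its vertical one, which is the behavior expected of a mirror reflection.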

MDPose: Real-Time Multi-Person Pose Estimation via Mixture Density Model
Seunghyeon Seo, Jaeyoung Yoo, Jihye Hwang, Nojun Kwak
UAI 2023
arXiv

We introduce MDPose, a real-time multi-person pose estimation method based on mixture density modeling. By randomly grouping keypoints and modeling their joint distribution without relying on person-specific instance IDs, MDPose achieves high accuracy and real-time performance even in crowded scenes with complex pose variations.

End-to-End Multi-Object Detection with a Regularized Mixture Model
Jaeyoung Yoo*, Hojun Lee*, Seunghyeon Seo, Inseop Chung, Nojun Kwak
ICML 2023
code / arXiv

We present D-RMM, an end-to-end multi-object detection framework that models object locations using a regularized mixture density model. The training objective includes a novel Maximum Component Maximization (MCM) loss that prevents duplicate detections, resulting in improved accuracy and stability in both dense and sparse detection scenarios.

MixNeRF: Modeling a Ray with Mixture Density for Novel View Synthesis from Sparse Inputs
Seunghyeon Seo, Donghoon Han*, Yeonjin Chang*, Nojun Kwak
CVPR 2023   (Qualcomm Innovation Fellowship Korea 2023 Winner)
project page / code / video / arXiv

We propose MixNeRF, which models each camera ray as a mixture of Laplacian densities to better capture the multi-modal RGB distribution in sparsely sampled scenes. Our framework includes an auxiliary depth prediction task and a mixture regularization loss, allowing for more accurate novel view synthesis in few-shot NeRF settings.
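
To make the mixture idea concrete, here is a minimal sketch of the negative log-likelihood of a scalar value under a mixture of Laplacian densities. It is a textbook formulation in plain Python, not the MixNeRF training code; the function names, mixture weights, locations, and scales are all made-up values for illustration.

```python
import math


def laplace_pdf(x, loc, scale):
    """Density of a Laplace distribution with location `loc` and scale `scale`."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)


def mixture_nll(x, weights, locs, scales):
    """Negative log-likelihood of x under a weighted sum of Laplace densities.

    `weights` are assumed to be non-negative and to sum to 1.
    """
    likelihood = sum(
        w * laplace_pdf(x, m, b) for w, m, b in zip(weights, locs, scales)
    )
    return -math.log(likelihood)


# Hypothetical two-component mixture over a single color channel in [0, 1].
weights = [0.7, 0.3]
locs = [0.2, 0.8]
scales = [0.05, 0.05]

nll_near = mixture_nll(0.21, weights, locs, scales)  # close to the first mode
nll_far = mixture_nll(0.50, weights, locs, scales)   # between the two modes
```

A value close to one of the modes gets a lower negative log-likelihood than a value between them, which is what lets a mixture represent more than one plausible color.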

MUM: Mix Image Tiles and UnMix Feature Tiles for Semi-Supervised Object Detection
JongMok Kim, Jooyoung Jang, Seunghyeon Seo, Jisoo Jeong, Jongkeun Na, Nojun Kwak
CVPR 2022
code / arXiv

We introduce MUM, a semi-supervised object detection framework that applies strong spatial data augmentation by mixing image tiles and unmixing their corresponding features. This strategy allows the model to benefit from mixed inputs without corrupting label supervision, leading to improved performance in low-label regimes on COCO and VOC benchmarks.


Thanks for sharing the website template, Jon Barron. :)