Minje Kim's Home

Innovation and Education in AI and Audio Processing

Featured Projects

TGIF: A Family-Owned Voice AI
Posted in Featured Projects, Research Projects

Overview: In everyday life, our devices run many speech and audio applications that can benefit from target speaker extraction (TSE): the ability to pull your voice out of a noisy mixture of sounds.…
Posted by minje July 11, 2025
Audio Coding for Machines
Posted in Featured Projects, Research Projects

Machine-Learned Latent Features Are Codes for That Machine! When we think about compressing sound, we usually imagine MP3s or AACs, or even neural codecs with a higher compression ratio these…
Posted by minje July 11, 2025
Personalized Neural Speech Codec
Posted in Featured Projects, Research Projects

Have you ever wondered about a speech codec that's dedicated to your speech trait? Why? Of course, it is to reduce the bitrate while maintaining the speech quality after…
Posted by minje March 21, 2024
Scalable and Efficient Speech Enhancement Using Modified Cold Diffusion
Posted in Featured Projects, Research Projects

As we proposed in the BLOOM-Net project, scalability matters. To reiterate the argument, the main issue with current deep learning-based models for speech enhancement…
Posted by minje October 9, 2023
LaDiffCodec: Generative De-Quantization for Neural Speech Codec via Latent Diffusion
Posted in Featured Projects, Research Projects

Motivation: We bring the cool generative power of a diffusion model to speech coding. We call our codec LaDiffCodec because it actively uses the concept of latent diffusion…
Posted by minje October 9, 2023
Don’t Separate, Learn to Remix: End-to-End Neural Remixing
Posted in Featured Projects, Research Projects

TLDR: In this project, we developed an end-to-end neural network system that takes a music mixture as input and produces its remixed version per the user's intended volume changes of…
Posted by minje February 21, 2022
SpaIn-Net: Spatially Informed Music Source Separation
Posted in Featured Projects, Research Projects

The spatial image of a music source is an essential feature in the stereophonic music listening experience. An extreme case would be "Yellow Submarine" by The Beatles, where all the…
Posted by minje February 18, 2022
BLOOM-Net: Scalability Matters
Posted in Featured Projects, Research Projects

Scalability is a big deal when it comes to video coding. When you watch a movie via a streaming service on Friday night, the video quality fluctuates: it's the video…
Posted by minje February 5, 2022
Personalized Speech Enhancement
Posted in Featured Projects, Research Projects

(Download Interspeech 2022 Tutorial Slides) The outstanding development of modern AI has relied greatly on improved modeling capacity. Deep learning models, for example, are effective in scaling up…
Posted by minje June 12, 2021
Psychoacoustic Loss Functions for Neural Audio Coding
Posted in Featured Projects, Research Projects

Neural audio coding is an area where we want to compress an audio signal down to a bitstring, which should be recovered as another audio signal that sounds as…
Posted by minje October 5, 2020
