Publications

Underlined names are students advised by me.

2024

  • Tsun-An Hsieh, Heeyoul Choi, and Minje Kim,
    “Multimodal Representation Loss Between Timed Text and Audio for Regularized Speech Separation,”
    in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech),
    Kos Island, Greece, Sep. 1-5, 2024.
    [pdf]
  • Haici Yang, Jiaqi Su, Minje Kim, and Zeyu Jin,
    “Genhancer: High-Fidelity Speech Enhancement via Generative Modeling on Discrete Codec Tokens,”
    in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech),
    Kos Island, Greece, Sep. 1-5, 2024.
    [pdf]
  • Minje Kim and Trausti Kristjansson,
    “Scalable and Efficient Speech Enhancement Using Modified Cold Diffusion: a Residual Learning Approach,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Seoul, Korea, Apr. 14-19, 2024
    [pdf, demo].
  • Haici Yang, Inseon Jang, and Minje Kim,
    “Generative De-Quantization for Neural Speech Codec via Latent Diffusion,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Seoul, Korea, Apr. 14-19, 2024
    [pdf, demo, code].
  • Kahyun Choi and Minje Kim,
    “A Comparative Analysis of Poetry Reading Audio: Singing, Narrating, or Somewhere In Between?”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Seoul, Korea, Apr. 14-19, 2024
    [pdf, code]
  • Inseon Jang, Haici Yang, Wootaek Lim, Seungkwon Beack, and Minje Kim,
    “Personalized Neural Speech Codec,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Seoul, Korea, Apr. 14-19, 2024
    [pdf, demo]
  • Darius Petermann and Minje Kim,
    “Hyperbolic Distance-Based Speech Separation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Seoul, Korea, Apr. 14-19, 2024
    [pdf, demo]
  • Sunwoo Kim, Mrudula Athi, Guangji Shi, Minje Kim, and Trausti Kristjansson,
    “Zero-Shot Test-Time Adaptation Via Knowledge Distillation for Personalized Speech Denoising and Dereverberation,”
    Journal of the Acoustical Society of America,
    Vol. 155, No. 2, pp. 1353-1367, Feb. 2024
    [pdf] [WASPAA 2021 supplementary material: code, demo, presentation video]

2023

  • Tevin Williams, Matthew Setzler, Minje Kim, Rachel Ryskin, Michael Spivey, and Tyler Marghetis,
    “Professional Jazz Musicians Explore and Exploit a Space of Sounds,”
    in Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci),
    Vol. 46, Dec. 2023.
    [pdf]
  • Anastasia Kuznetsova, Aswin Sivaraman, and Minje Kim,
    “The Potential of Neural Speech Synthesis-Based Data Augmentation for Personalized Speech Enhancement,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Rhodes Island, Greece, June 4-10, 2023
    [pdf, presentation video]
  • Darius Petermann, Inseon Jang, and Minje Kim,
    “Native Multi-Band Audio Coding within Hyper-Autoencoded Reconstruction Propagation Networks,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Rhodes Island, Greece, June 4-10, 2023
    [pdf, demo, code, presentation video]
  • Haici Yang, Wootaek Lim, and Minje Kim,
    “Neural Feature Predictor and Discriminative Residual Coding for Low-Bitrate Speech Coding,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Rhodes Island, Greece, June 4-10, 2023
    [pdf, code, presentation video]

2022

  • Aswin Sivaraman and Minje Kim,
    “Efficient Personalized Speech Enhancement through Self-Supervised Learning,”
    IEEE Journal of Selected Topics in Signal Processing,
    vol. 16, no. 6, pp. 1342-1356, Oct. 2022
    [pdf, demo, presentation video]
    (Also presented at ICASSP 2023)
  • Sunwoo Kim and Minje Kim,
    “Boosted Locality Sensitive Hashing: Discriminative, Efficient, and Scalable Binary Codes for Source Separation,”
    IEEE/ACM Transactions on Audio, Speech, and Language Processing,
    vol. 30, pp. 2659-2672, Aug. 2022
    [pdf, demo, code, presentation video]
    (Also presented at ICASSP 2023)
  • Darius Petermann and Minje Kim,
    “SpaIn-Net: Spatially-Informed Stereophonic Music Source Separation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Singapore, May 22-27, 2022
    [pdf, demo, code, presentation video]
  • Sunwoo Kim and Minje Kim,
    “BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Singapore, May 22-27, 2022
    [pdf, demo, code, presentation video]
  • Haici Yang, Shivani Firodiya, Nicholas J. Bryan, and Minje Kim,
    “Don’t Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Singapore, May 22-27, 2022
    [pdf, demo, code, presentation video]
  • Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, and Wontak Kim,
    “Upmixing Via Style Transfer: a Variational Autoencoder for Disentangling Spatial Images and Musical Content,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Singapore, May 22-27, 2022
    [pdf, demo, presentation video]
  • Hao Zhang, Srivatsan Kandadai, Harsha Rao, Minje Kim, Tarun Pruthi, and Trausti Kristjansson,
    “Deep Adaptive AEC: Hybrid of Deep Learning and Adaptive Acoustic Echo Cancellation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Singapore, May 22-27, 2022
    [pdf].
  • Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, and Minje Kim,
    “Scalable and Efficient Neural Speech Coding: A Hybrid Design,”
    IEEE/ACM Transactions on Audio, Speech, and Language Processing,
    vol. 30, pp. 12-25, 2022
    [pdf].

2021

  • Darius Petermann, Seungkwon Beack, and Minje Kim,
    “HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding,”
    in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA),
    New Paltz, NY, Oct. 17-20, 2021
    [pdf, code, demo, presentation video]
  • Aswin Sivaraman and Minje Kim,
    “Zero-Shot Personalized Speech Enhancement Through Speaker-Informed Model Selection,”
    in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA),
    New Paltz, NY, Oct. 17-20, 2021
    [pdf, code, presentation video]
  • Sunwoo Kim and Minje Kim,
    “Test-Time Adaptation Toward Personalized Speech Enhancement: Zero-Shot Learning With Knowledge Distillation,”
    in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA),
    New Paltz, NY, Oct. 17-20, 2021
    [pdf, code, demo, presentation video]
  • Aswin Sivaraman, Sunwoo Kim, and Minje Kim,
    “Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification,”
    in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech),
    Brno, Czech Republic, Aug. 30 – Sep. 3, 2021
    [pdf, code, presentation video]
  • R. David Badger, Kristopher H. Jung, and Minje Kim,
    “An Open-Sourced Time-Frequency Domain RF Classification Framework,”
    in Proceedings of the 29th European Signal Processing Conference (EUSIPCO),
    Dublin, Ireland, Aug. 23-27, 2021
    [pdf, code, dataset, presentation video]
  • R. David Badger and Minje Kim,
    “Singular Value Decomposition for Compression of Large-Scale Radio Frequency Signals,”
    in Proceedings of the 29th European Signal Processing Conference (EUSIPCO),
    Dublin, Ireland, Aug. 23-27, 2021
    [pdf, code, dataset, presentation video]
  • Vibhatha Abeykoon, Geoffrey Fox, Minje Kim, Saliya Ekanayake, Supun Kamburugamuve, Kannan Govindarajan, Pulasthi Wickramasinghe, Niranda Perera, Chathura Widanage, Ahmet Uyar, Gurhan Gunduz, and Selahatin Akkas,
    “Stochastic gradient descent‐based support vector machines training optimization on Big Data and HPC frameworks,”
    Concurrency and Computation: Practice and Experience,
    2021;e6292.
    https://doi.org/10.1002/cpe.6292 [pdf]
  • Haici Yang, Kai Zhen, Seungkwon Beack, and Minje Kim,
    “Source-Aware Neural Speech Coding for Noisy Speech Compression,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Toronto, Canada, June 6-12, 2021
    [pdf, demo, code, presentation video]

2020

  • Kai Zhen, Mi Suk Lee, Jongmo Sung, Seungkwon Beack, and Minje Kim,
    “Psychoacoustic Calibration of Loss Functions for Efficient End-to-End Neural Audio Coding,”
    IEEE Signal Processing Letters, vol. 27, pp. 2159-2163, 2020. [pdf, demo, code, presentation video]
    (Also presented at ICASSP 2022)
  • Aswin Sivaraman and Minje Kim,
    “Sparse Mixture of Local Experts for Efficient Speech Enhancement,”
    in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech),
    Shanghai, China, October 25-29, 2020
    <Winner of the Travel Grant>
    [pdf, demo, code, presentation video]
  • Sanna Wager, George Tzanetakis, Cheng-i Wang, and Minje Kim,
    “Deep Autotuner: A Pitch Correcting Network for Singing Performances,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Barcelona, Spain, May 4-8, 2020
    [pdf, demo, code, presentation video]
  • Sunwoo Kim, Haici Yang, and Minje Kim,
    “Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Barcelona, Spain, May 4-8, 2020
    <Finalist for the Best Student Paper Award>
    [pdf, demo, code, presentation video]
  • Kai Zhen, Mi Suk Lee, Jongmo Sung, Seungkwon Beack, and Minje Kim,
    “Efficient and Scalable Neural Residual Waveform Coding with Collaborative Quantization,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Barcelona, Spain, May 4-8, 2020
    [pdf, demo, code, presentation video]
  • Kai Zhen, Mi Suk Lee, and Minje Kim,
    “A Dual-Staged Context Aggregation Method Towards Efficient End-to-End Speech Enhancement,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Barcelona, Spain, May 4-8, 2020
    [pdf, demo, presentation video]
  • Qian Lou, Feng Guo, Minje Kim, Lantao Liu, and Lei Jiang,
    “AutoQ: Automated Kernel-Wise Neural Network Quantization,”
    in Proceedings of the International Conference on Learning Representations (ICLR),
    Addis Ababa, Ethiopia, Apr. 26-30, 2020
    [pdf]

2019

  • Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, and Minje Kim,
    “Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding,”
    in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech),
    Graz, Austria, September 15-19, 2019
    [pdf, demo, code]
  • Geoffrey Fox, James A. Glazier, JCS Kadupitiya, Vikram Jadhao, Minje Kim, Judy Qiu, James P. Sluka, Endre Somogyi, Madhav Marathe, Abhijin Adiga, Jiangzhuo Chen, Oliver Beckstein, and Shantenu Jha,
    “Learning Everywhere: Pervasive Machine Learning for Effective High-Performance Computation,”
    in Proceedings of the IEEE International Workshop on High-Performance Big Data, Deep Learning, and Cloud Computing (HPBDC),
    Rio de Janeiro, Brazil, May 20, 2019
    [pdf]
  • Vibhatha Abeykoon, Geoffrey Fox, and Minje Kim,
    “Performance Optimization on Model Synchronization in Parallel Stochastic Gradient Descent Based SVM,”
    in Proceedings of the High Performance Machine Learning Workshop (HPML),
    Cyprus, May 14, 2019
    [pdf]
  • Sunwoo Kim, Mrinmoy Maity, and Minje Kim,
    “Incremental Binarization On Recurrent Neural Networks for Single-Channel Source Separation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Brighton, UK, May 12-17, 2019
    [pdf, code]
  • Sanna Wager, George Tzanetakis, Stefan Sullivan, Cheng-i Wang, John Shimmin, Minje Kim, and Perry Cook,
    “Intonation: a Dataset of Quality Vocal Performances Refined by Spectral Clustering on Pitch Congruence,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Brighton, UK, May 12-17, 2019
    [pdf, dataset]

2018

  • Michael Bechtel, Elise McEllhiney, Minje Kim, and Heechul Yun,
    “DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car,”
    in Proceedings of the 24th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA),
    Hakodate, Japan, Aug. 28-31, 2018
    [pdf, code]
  • Sanna Wager and Minje Kim,
    “Collaborative speech dereverberation: regularized tensor factorization for crowdsourced multi-channel recordings,”
    in Proceedings of the 26th European Signal Processing Conference (EUSIPCO),
    Rome, Italy, Sep. 3-7, 2018
    [pdf]
  • Matt Setzler, Tyler Marghetis, and Minje Kim,
    “Creative leaps in musical ecosystems: early warning signals of critical transitions in professional jazz,”
    in Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci),
    Madison, WI, Jul. 25-28, 2018
    [pdf]
  • Lijiang Guo and Minje Kim,
    “Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Calgary, Canada, Apr. 15-20, 2018
    [pdf]
  • Minje Kim and Paris Smaragdis,
    “Bitwise Neural Networks for Efficient Single-Channel Source Separation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Calgary, Canada, Apr. 15-20, 2018
    [pdf, demo]

2017

  • Lei Jiang, Minje Kim, Wujie Wen, and Danghui Wang,
    “XNOR-POP: A Processing-in-Memory Architecture for Binary Convolutional Neural Networks in Wide-IO2 DRAMs,”
    in Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED),
    Taipei, Taiwan, Jul. 24-26, 2017
    [pdf]
  • Hongwei Wang, Yunlong Gao, Shaohan Hu, Shiguang Wang, Renato Mancuso, Minje Kim, Poliang Wu, Lu Su, Lui Sha, and Tarek Abdelzaher,
    “On Exploiting Structured Human Interactions to Enhance Sensing Accuracy in Cyber-physical Systems,”
    ACM Transactions on Cyber-Physical Systems,
    vol. 1, no. 3, article 16, pp. 16:1-16:19, Jul. 2017.
    [pdf]
  • Minje Kim,
    “Collaborative Deep Learning for Speech Enhancement: A Run-Time Model Selection Method Using Autoencoders,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    New Orleans, LA, Mar. 5-9, 2017
    [pdf]
  • Sanna Wager, Liang Chen, Minje Kim, and Christopher Raphael,
    “Towards Expressive Instrument Synthesis Through Smooth Frame-By-Frame Reconstruction: From String To Woodwind,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    New Orleans, LA, Mar. 5-9, 2017. 
    [pdf, demo]

2016

  • Minje Kim and Paris Smaragdis,
    “Efficient Neighborhood-Based Topic Modeling for Collaborative Audio Enhancement on Massive Crowdsourced Recordings,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Shanghai, China, March 20-25, 2016. [pdf]

2015

  • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, and Paris Smaragdis,
    “Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation,”
    IEEE/ACM Transactions on Audio, Speech, and Language Processing,
    vol. 23, no. 12, pp. 2136-2147, Dec. 2015
    <Winner of 2020 IEEE Signal Processing Society Best Paper Award>
    [pdf, demo, code]
  • Minje Kim and Paris Smaragdis,
    “Adaptive Denoising Autoencoders: A Fine-tuning Scheme to Learn from Test Mixtures,”
    in Proceedings of the International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA),
    Liberec, Czech Republic, August 25-28, 2015
    <Finalist for the best student paper on audio signal processing>
    [pdf]
  • Minje Kim, Paris Smaragdis, and Gautham J. Mysore,
    “Efficient Manifold Preserving Audio Source Separation Using Locality Sensitive Hashing,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Brisbane, Australia, April 19-24, 2015
    [pdf]
  • Yunlong Gao, Shaohan Hu, Renato Mancuso, Hongwei Wang, Minje Kim, Poliang Wu, Lu Su, Lui Sha, and Tarek Abdelzaher,
    “Exploiting Structured Human Interactions to Enhance Estimation Accuracy in Cyber-physical Systems,”
    in Proceedings of the International Conference on Cyber-Physical Systems (ICCPS),
    Seattle, WA, April 14-16, 2015.
    [pdf]
  • Minje Kim and Paris Smaragdis,
    “Bitwise Neural Networks,”
    in Proceedings of the International Conference on Machine Learning (ICML) Workshop on Resource-Efficient Machine Learning,
    Lille, France, Jul. 6-11, 2015.
    [pdf]
  • Minje Kim and Paris Smaragdis,
    “Mixtures of Local Dictionaries for Unsupervised Speech Enhancement,”
    IEEE Signal Processing Letters,
    vol. 22, no. 3, pp. 288-292, Mar. 2015.
    [pdf]
    (Also presented at ICASSP 2015)

2014

  • Minje Kim and Paris Smaragdis,
    “Collaborative Audio Enhancement: Crowdsourced Audio Recording,”
    Neural Information Processing Systems (NIPS) Workshop on Crowdsourcing and Machine Learning,
    Montreal, Canada, Dec. 8-13, 2014
    [pdf]
  • Minje Kim and Paris Smaragdis,
    “Efficient Model Selection for Speech Enhancement Using a Deflation Method for Nonnegative Matrix Factorization,”
    in Proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP),
    Atlanta, GA, December 3-5, 2014
    [pdf]
  • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, and Paris Smaragdis,
    “Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks,”
    in Proceedings of the International Society for Music Information Retrieval Conference (ISMIR),
    Taipei, Taiwan, Oct. 27-31, 2014
    [pdf, demo, code]
  • Ding Liu, Paris Smaragdis, and Minje Kim,
    “Experiments on Deep Learning for Speech Denoising,”
    in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech),
    Singapore, September 14-18, 2014
    [pdf]
  • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, and Paris Smaragdis,
    “Deep Learning for Monaural Speech Separation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Florence, Italy, May 4-9, 2014
    [pdf, demo, code, bib]
    <Starkey Signal Processing Research Student Grant>
  • Johannes Traa, Minje Kim, and Paris Smaragdis,
    “Phase and Level Difference Fusion for Robust Multichannel Source Separation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Florence, Italy, May 4-9, 2014
    [pdf, bib]

2013

  • Paris Smaragdis and Minje Kim,
    “Non-Negative Matrix Factorization for Irregularly-Spaced Transforms,”
    in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA),
    New Paltz, NY, Oct. 20-23, 2013
    [pdf, bib]
  • Minje Kim and Paris Smaragdis,
    “Single Channel Source Separation Using Smooth Nonnegative Matrix Factorization with Markov Random Fields,”
    in Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP),
    Southampton, UK, Sep. 22-25, 2013
    [pdf, bib]
  • Minje Kim and Paris Smaragdis,
    “Manifold Preserving Hierarchical Topic Models for Quantization and Approximation,”
    in Proceedings of the International Conference on Machine Learning (ICML),
    Atlanta, GA, Jun. 16-21, 2013
    [pdf, bib]
  • Minje Kim and Paris Smaragdis,
    “Collaborative Audio Enhancement Using Probabilistic Latent Component Sharing,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Vancouver, BC, Canada, May 26-31, 2013
    [pdf, demo, bib]
    <Google ICASSP Student Travel Grant>
    <Best Student Paper in the Audio and Acoustic Signal Processing (AASP) area>
  • C. Zhang, G.G. Ko, J.W. Choi, S.-N. Tsai, Minje Kim, A.G. Rivera, R. Rutenbar, P. Smaragdis, M.S. Park, V. Narayanan, H. Xin, O. Mutlu, B. Li, L. Zhao, M. Chen, and R. Iyer,
    “EMERALD: Characterization of Emerging Applications and Algorithms for Low-power Devices,”
    in Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS),
    Austin, TX, Apr. 21-23, 2013
    [pdf, bib]

2012

  • Minje Kim, Paris Smaragdis, Glenn G. Ko, and Rob A. Rutenbar,
    “Stereophonic Spectrogram Segmentation Using Markov Random Fields,”
    in Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP),
    Santander, Spain, Sep. 23-26, 2012
    [pdf, demo, bib]

2011

  • Seungkwon Beack, Taejin Lee, Minje Kim, and Kyeongok Kang,
    “An Efficient Time-Frequency Representation for Parametric-Based Audio Object Coding,”
    ETRI Journal, vol. 33, no. 6, pp. 945-948, Dec. 2011
    [pdf, bib]
  • Minje Kim, Jiho Yoo, Kyeongok Kang, and Seungjin Choi,
    “Nonnegative Matrix Partial Co-Factorization for Spectral and Temporal Drum Source Separation,”
    IEEE Journal of Selected Topics in Signal Processing,
    vol. 5, no. 6, pp. 1192-1204, Oct. 2011
    [pdf, demo, bib]
  • Minje Kim, Seungkwon Beack, Keunwoo Choi, and Kyeongok Kang,
    “Gaussian Mixture Model for Singing Voice Separation From Stereophonic Music,”
    in Proceedings of the Audio Engineering Society 43rd International Conference (AES Conference),
    Pohang, Korea, Sep. 29 – Oct. 1, 2011
    [pdf, demo, bib]

2010

  • Minje Kim, Jiho Yoo, Kyeongok Kang, and Seungjin Choi,
    “Blind Rhythmic Source Separation: Nonnegativity and Repeatability,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Dallas, TX, Mar. 14-19, 2010
    [pdf, demo, bib]
  • Jiho Yoo, Minje Kim, Kyeongok Kang, and Seungjin Choi,
    “Nonnegative Matrix Partial Co-Factorization for Drum Source Separation,”
    in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
    Dallas, TX, Mar. 14-19, 2010
    [pdf, demo, bib]

2008

  • Minje Kim, Seungkwon Beack, Taejin Lee, Daeyoung Jang, and Kyeongok Kang,
    “Segmented Dimensionality Reduction Coding on Frequency Domain Signal,”
    in Proceedings of the Audio Engineering Society 34th International Conference (AES Conference),
    Jeju Island, Korea, Aug. 28-30, 2008
    [pdf, bib]

2007

  • Minje Kim, Minsik Park, Seung-jun Yang, Ji Hoon Choi, and Han-kyu Lee,
    “System Aspects of TV-Anytime Metadata Codec in a Uni-directional Broadcasting Environment,”
    in Proceedings of the IEEE International Symposium on Consumer Electronics (ISCE),
    Dallas, TX, Jun. 20-23, 2007
    [pdf, bib]
  • Seung-jun Yang, Jung Won Kang, Dong-San Jun, Minje Kim, and Han-kyu Lee,
    “TV-Anytime Metadata Authoring Tool for Personalized Broadcasting Services,”
    in Proceedings of the IEEE International Symposium on Consumer Electronics (ISCE),
    Dallas, TX, Jun. 20-23, 2007
    [pdf, bib]

2006

  • Minje Kim and Seungjin Choi,
    “ICA-based clustering for resolving permutation ambiguity in frequency-domain convolutive source separation,”
    in Proceedings of the IEEE International Conference on Pattern Recognition (ICPR),
    Hong Kong, Aug. 20-24, 2006.
    [pdf, bib]
  • Minje Kim and Seungjin Choi,
    “Monaural Music Source Separation: Nonnegativity, Sparseness, and Shift-invariance,”
    in Proceedings of the International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA),
    pp. 617-624, Charleston, SC, Mar. 5-8, 2006, (LNCS 3889)
    [pdf, demo, bib]

2005

  • Minje Kim and Seungjin Choi,
    “On Spectral Basis Selection for Single Channel Polyphonic Music Separation,”
    in Proceedings of the International Conference on Artificial Neural Networks (ICANN),
    Warsaw, Poland, Sep. 11-15 2005, (LNCS 3697)
    [pdf, demo, bib].

Ph.D. Dissertation

  • Minje Kim,
    “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,”
    Ph.D. Dissertation, Department of Computer Science, University of Illinois at Urbana-Champaign, May, 2016
    [pdf]

M.S. Thesis

  • Minje Kim,
    “Monaural Music Source Separation: Nonnegativity, Sparseness, and Shift-Invariance,”
    Master’s Thesis, Department of Computer Science and Engineering, POSTECH, Feb, 2006
    [pdf]

Supervised Ph.D. Dissertations

  • R. David Badger,
    “Open-Source Classification Systems for Frequency-Domain RF Signals: Robust Physical Layer Multi-Sample Rate Processing,”
    Ph.D. Dissertation, Department of Intelligent Systems Engineering, Indiana University, June, 2022
    [pdf]
  • Sunwoo Kim,
    “Model Compression for Efficient Machine Learning Inference,”
    Ph.D. Dissertation, Department of Intelligent Systems Engineering, Indiana University, June, 2022
    [pdf]
  • Kai Zhen,
    “Neural Waveform Coding: Scalability, Efficiency, and Psychoacoustic Calibration,”
    Ph.D. Dissertation, Department of Computer Science and Cognitive Science Program, Indiana University, May, 2021.
    <Winner of the Outstanding Research Award (IU Cognitive Science)>
    [pdf]
  • Sanna Wager,
    “A Data-Driven Pitch Correction Algorithm for Singing Voice,”
    Ph.D. Dissertation, Department of Informatics, Indiana University, May, 2021
    [pdf]

Talks, Posters, Other Presentations

Tutorials

  • Minje Kim and Jan Skoglund, “Neural Speech and Audio Coding,” Interspeech 2024 Tutorial, Sep. 1, 2024
  • Minje Kim, “Personalized Speech Enhancement: Data- and Resource-Efficient Machine Learning,” Interspeech 2022 Tutorial, Sep. 18, 2022 [slides]

Invited Talks

  • “Improving Scalability, Efficiency, Personalization, and Interactivity in Audio Processing,” Yamaha, Hamamatsu, Japan, Apr. 11, 2024
  • “Revamping latent variable analysis for speech and audio processing,” Academia Sinica, Taipei, Taiwan, Apr. 9, 2024
  • “Personalized AI for Speech Enhancement and Music Applications,” GIST, Gwangju, Korea, Jun. 1, 2023
  • “Personalized AI for Speech Enhancement and Music Applications,” Sogang University, Seoul, Korea, May 26, 2023
  • “Data- and Resource-Efficient Machine Learning for Personalized Speech Enhancement,” Johns Hopkins University, Center for Language and Speech Processing, Baltimore, MD, USA, Dec. 2, 2022
  • “Data- and Resource-Efficient Machine Learning for Personalized Speech Enhancement,” Samsung Research, Korea, May 26, 2022
  • “Latent Representations for Audio Music Signal Processing,” Graduate School of Culture Technology, KAIST, Daejeon, Korea, May 20, 2022
  • “Data Efficiency and Privacy Preservation for Personalized Machine Learning Models: from the Perspective of Audio Applications,” POSTECH, Pohang, Korea, Dec. 15, 2021
  • “Toward Scalable, Efficient, and Perceptually Meaningful Neural Waveform Coding,” Fraunhofer IIS, Erlangen, Germany, Dec. 3, 2021
  • “Data Efficiency and Privacy Preservation for Personalized Machine Learning Models: from the Perspective of Audio Applications,” School of Computer Science and Electrical Engineering, Handong Global University, Korea, Mar. 31, 2021
  • “Efficient Neural Audio Processing Models,” Dept. of Electrical and Computer Engineering, University of Rochester, Rochester, NY, Dec. 11, 2019
  • A half-day seminar at Amazon Lab126, Sunnyvale, CA, Dec. 6, 2019
  • “Audio Computing in the Wild: Frameworks for Collaborative and Efficient AI,” Department of Music and Performing Arts Professions and Center for Data Science, New York University, Mar. 19, 2018
  • “Using Bitwise Machine Learning Models for Resource-Constrained Edge Devices,” Int’l Conf. on Parallel Architectures and Compilation Techniques (PACT) Workshop on Computational Intelligence and Soft Computing (CISC 2017), Sep. 10, 2017
  • “Bitwise Deep Recurrent Neural Networks for Efficient Context-Aware Pervasive Systems,” Intel Labs., Hillsboro, OR, Aug. 16, 2017
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Graduate School of Culture Technology, KAIST, Daejeon, Korea, Oct. 7, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Graduate School of Convergence Science and Technology, Seoul National University, Suwon, Korea, Oct. 6, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Qualcomm Korea, Seoul, Korea, Oct. 6, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Hanyang University, Seoul, Korea, Apr. 6, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” ETRI, Daejeon, Korea, Mar. 29, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Naver Labs, Seongnam, Korea, Mar. 29, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Google, Mountain View, CA, Mar. 9, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” School of Informatics and Computing, Indiana University, Bloomington, IN, Feb. 29, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Lyric Labs, Analog Devices, Cambridge, MA, Feb. 23, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Adobe Research, San Francisco, CA, Feb. 10, 2016
  • “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” IBM T. J. Watson Research Center, Yorktown Heights, NY, Jan. 7, 2016
  • “Music Source Separation: Spectrogram Factorization,” Sejong University, Seoul, Korea, Jun. 10, 2011

Panels

Talks at Non-Archival Venues

  • Aswin Sivaraman, Minje Kim, “Self-Supervised Learning from Contrastive Mixtures for Personalized Speech Enhancement,” NeurIPS 2020 Self-Supervised Learning for Speech and Audio Processing Workshop, Dec. 11, 2020
  • Minje Kim, “Deep Autotuner: A Data-Driven Approach to Natural-Sounding Pitch Correction for Singing Voice in Karaoke Performances,” Midwest Music and Audio Day, Bloomington, IN, Jun. 27, 2019
  • Kai Zhen, Minje Kim, “On psychoacoustically weighted cost functions towards resource-efficient deep neural networks for speech denoising,” Seventh Annual Midwest Cognitive Science Conference, Bloomington, IN, May 12, 2018
  • Glenn Kastern and Sanna Wager, “Learning the pulse: statistical and ML analysis of real-time audio performance logs,” Audio Developer Conference, Nov. 14, 2017
  • Minje Kim, “Bitwise Source Separation,” Midwest Music and Audio Day, Northwestern University, Evanston, IL, Jun. 23, 2017

Poster Presentations at Non-Archival Venues

  • Anastasia Kuznetsova, Aswin Sivaraman, and Minje Kim, “The Potential of Neural Speech Synthesis-Based Data Augmentation for Personalized Speech Enhancement,” Speech and Audio in the Northeast (SANE) 2023, Oct. 26, 2023
  • Aswin Sivaraman, Minje Kim, “Efficient Personalized Speech Enhancement through Self-Supervised Learning,” Speech and Audio in the Northeast (SANE) 2022, Oct. 6, 2022
  • David Badger, “Radio Frequency Machine Learning in the Time-Frequency Domain,” Autonomous 2.0, WestGate Academy, Crane, IN, Mar. 1, 2022.
  • Minje Kim, “Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics,“ Speech and Audio in the Northeast (SANE) 2018, Oct. 18, 2018
  • Minje Kim, U.S. Air Force Science and Technology 2030, Bloomington, IN, May 10, 2018
  • Lijiang Guo, Minje Kim, “Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics,” NIPS 2017 workshop on Machine Learning for Audio, Dec. 8, 2017
  • Minje Kim, “Bitwise Neural Networks for Efficient Single Channel Source Separation,” NIPS 2017 workshop on Machine Learning for Audio, Dec. 8, 2017
  • Aswin Sivaraman, Kai Zhen, Minje Kim, IEEE EnCON, Indiana University, Bloomington, IN, Nov. 10-11, 2017
  • Minje Kim, Paris Smaragdis, “Bitwise Neural Networks for Source Separation,” Speech and Audio in the Northeast (SANE) Workshop, New York, NY, Oct. 22, 2015
  • Minje Kim, Paris Smaragdis, “Probabilistic Latent Component Sharing for the Separation of Non-Orthogonally Overlapping Sources,” Speech and Audio in the Northeast (SANE) Workshop, New York, NY, Oct. 24, 2013
  • Minje Kim et al., Intel Science and Technology Center – Embedded Computing (ISTC-EC) Workday, Apr. 4-5, 2012

Internal Talks

  • Minje Kim, “Tackling Data Efficiency Issues for Personalized Speech Enhancement,” ISE Colloquium Talk, Dept. of Intelligent Systems Engineering, Indiana University, Bloomington, IN, Apr. 2, 2021
  • Minje Kim, “Personalized Speech Enhancement: Test-Time Adaptation Using No or Few Private Data,” AI Talk Series, Luddy School of Informatics and Computing, Indiana University, Bloomington, IN, Sep. 15, 2020
  • Minje Kim, Data Science Online Immersion Weekend, Indiana University, Bloomington, IN, Mar. 3, 2018
  • Minje Kim, “Efficient Machine Learning Models: Binarization and Network Compression,” Intelligent & Interactive Systems Talk Series, School of Informatics and Computing, Indiana University, Bloomington, IN, Feb. 5, 2018
  • Minje Kim, Applied Research Institute Sensor Fusion Workshop, Indiana University, Bloomington, IN, Jun. 2, 2017
  • Minje Kim, “Bitwise Neural Networks,” Indiana University Bloomington/Bielefeld University Cognitive Interaction Technology Workshop, Indiana University, Bloomington, IN, May 17, 2017
  • Minje Kim, IBM CIO’s visit to IU, May 3, 2017
  • Minje Kim, “Bitwise Neural Networks,” Department of Statistics Colloquium Series, Indiana University, Bloomington, IN, Oct. 31, 2016
  • Minje Kim, “Bitwise Neural Networks,” Intelligent & Interactive Systems Talk Series, School of Informatics and Computing, Indiana University, Bloomington, IN, Oct. 31, 2016
  • Minje Kim, “To Make Machines Understand Sound,” Worldwide Youth in Science and Engineering (WYSE) Summer Camp: Discover Engineering, Urbana, IL, Jun. 27, 2016
  • Minje Kim, “Bitwise Neural Networks,” Coordinated Science Laboratory Student Conference, Urbana, IL, Feb. 18-19, 2016
  • Minje Kim, “Bitwise Neural Networks,” Beckman Graduate Seminar, Urbana, IL, Oct. 14, 2015
  • Minje Kim, Lyric Labs, Analog Devices, Cambridge, MA, Jun. 12, 2012
  • Minje Kim, Department of Electrical and Computer Engineering, UIUC (with visitors from Sony, Japan), May 10, 2012

Selected Patents

Out of more than 50 patent applications:

  • “Recurrent multimodal attention system based on expert gated networks,” US Patent App. 16/417,554
  • “Audio Signal Encoding Method and Device, and Audio Signal Decoding Method and Device,” US Patent App. 16/541,959
  • “Audio signal encoding method and apparatus and audio signal decoding method and apparatus using psychoacoustic-based weighted error function,” US Patent App. 16/122,708
  • “Irregular Pattern Identification Using Landmark Based Convolution,” US Patent No. 10,002,622, 2018
  • “Irregularity detection in music,” US Patent No. 9,734,844, 2017
  • “Automatic detection of dense ornamentation in music,” US Patent No. 9,514,722, 2016
  • “Pattern Matching of Sound Data Using Hashing,” US Patent No. 9,449,085, 2016
  • “Multichannel Sound Source Identification and Localization,” US Patent No. 9,351,093, 2016
  • “Sound Data Identification,” US Patent No. 9,215,539, 2015
  • “Method and System for Separating Music Sound Source Using Time and Frequency Characteristics,” US
    Patent No. 8,563,842, 2013
  • “Method and System for Separating Music Sound Source,” US Patent No. 8,340,943, 2012
  • “Method and system for separating musical sound source without using sound source database,” US Patent
    No. 8,080,724, 2011