• Jump to content
  • Jump to navigation
  • Jump to bottom of page
Simulate organization breadcrumb open Simulate organization breadcrumb close
  • FAUTo the central FAU website
  • RRZE
  • NHR-Geschäftsstelle
  • Gauß-Allianz

Navigation Navigation close
  • News
  • People
  • Research
    • Research Focus
    • Publications, Posters and Talks
    • Software & Tools
    • HPC Performance Lab
    • Atomic Structure Simulation Lab
    • NHR PerfLab Seminar
    • Projects
    • Awards
    Portal Research
  • Teaching & Training
    • Lectures and Seminars
    • Tutorials and Courses
    • Theses
    • HPC Cafe
    • Student Cluster Competition
    Portal Teaching
  • Systems & Services
    • Systems, Documentation & Instructions
    • Support & Contact
    • Success Stories from the Support
    • Training Resources
    • Summary of System Utilization
    • Reports from User Projects
    Portal Systems & Services

  1. Home
  2. Systems & Services
  3. User projects
  4. Electrical engineering & Audio processing
  5. HPC User Report from N. Pia (AudioLabs)

HPC User Report from N. Pia (AudioLabs)

In page navigation: Systems & Services
  • Systems, Documentation & Instructions
    • Getting started with HPC
    • NHR application rules – NHR@FAU
    • HPC clusters & systems
      • Dialog server
      • Alex GPGPU cluster (NHR+Tier3)
      • Fritz parallel cluster (NHR+Tier3)
      • Meggie parallel cluster (Tier3)
      • Emmy parallel cluster (Tier3)
      • Woody throughput cluster (Tier3)
      • TinyFat cluster (Tier3)
      • TinyGPU cluster (Tier3)
      • Test cluster
      • Jupyterhub
    • SSH – Secure Shell access to HPC systems
    • File systems
    • Batch Processing
      • Job script examples – Slurm
      • Advanced topics Slurm
      • Torque batch system
    • Software environment
    • Special applications, and tips & tricks
      • Amber/AmberTools
      • ANSYS CFX
      • ANSYS Fluent
      • ANSYS Mechanical
      • Continuous Integration / Gitlab Cx
      • CP2K
      • CPMD
      • GROMACS
      • IMD
      • Intel MKL
      • LAMMPS
      • Matlab
      • NAMD
      • OpenFOAM
      • ORCA
      • Python and Jupyter
      • Quantum Espresso
      • R and R Studio
      • STAR-CCM+
      • Tensorflow and PyTorch
      • TURBOMOLE
      • VASP
        • Request access to central VASP installation
      • Working with NVIDIA GPUs
      • WRF
  • Support & Contact
    • Monthly HPC Cafe
    • HPC Performance Lab
    • Atomic Structure Simulation Lab
    • Support Success Stories
      • Success story: Elmer/Ice
  • HPC User Training
  • HPC System Utilization
  • User projects
    • Biology, life sciences & pharmaceutics
      • HPC User Report from A. Bochicchio (Professorship of Computational Biology)
      • HPC User Report from A. Horn (Bioinformatics)
      • HPC User Report from C. Söldner (Professorship for Bioinformatics)
      • HPC User Report from J. Calderón (Computer Chemistry Center)
      • HPC User Report from J. Kaindl (Chair of Medicinal Chemistry)
      • HPC User Report from K. Pluhackova (Computational Biology Group)
    • Chemical & mechanical engineering
      • HPC User Report from A. Leonardi (Institute for Multiscale Simulation)
      • HPC User Report from F. Lenahan (Institute of Advanced Optical Technologies – Thermophysical Properties)
      • HPC User Report from F. Weber (Chair of Applied Mechanics)
      • HPC User Report from K. Nusser (Institute of Process Machinery and Systems Engineering)
      • HPC User Report from K. Nusser (Institute of Process Machinery and Systems Engineering)
      • HPC User Report from L. Eckendörfer (Catalytic Reactors and Process Technology)
      • HPC User Report from M. Klement (Institute for Multiscale Simulation)
      • HPC User Report from M. Münsch (Chair of Fluid Mechanics)
      • HPC User Report from T. Klein (Institute of Advanced Optical Technologies – Thermophysical Properties)
      • HPC User Report from T. Schikarski (Chair of Fluid Mechanics / Chair of Particle Technology)
      • HPC User Report from U. Higgoda (Institute of Advanced Optical Technologies – Thermophysical Properties)
    • Chemistry
      • HPC User Report from B. Becit (Professorship of Theoretical Chemistry)
      • HPC User Report from B. Meyer (Computational Chemistry – ICMM)
      • HPC User Report from D. Munz (Chair of Inorganic and General Chemistry)
      • HPC User Report from J. Konrad (Professorship of Theoretical Chemistry)
      • HPC User Report from P. Schwarz (Interdisciplinary Center for Molecular Materials)
      • HPC User Report from S. Frühwald (Chair of Theoretical Chemistry)
      • HPC User Report from S. Maisel (Chair of Theoretical Chemistry)
      • HPC User Report from S. Sansotta (Professorship of Theoretical Chemistry)
      • HPC User Report from S. Seiler (Interdisciplinary Center for Molecular Materials)
      • HPC User Report from S. Trzeciak (Professorship of Theoretical Chemistry)
      • HPC User Report from T. Klöffel (Interdisciplinary Center for Molecular Materials)
      • HPC User Report from T. Kollmann (Professorship of Theoretical Chemistry)
    • Computer science & Mathematics
      • HPC User Report from B. Jakubaß & S. Falk (Division of Phoniatrics and Pediatric Audiology)
      • HPC User Report from D. Schuster (Chair for System Simulation)
      • HPC User Report from F. Wein (Professorship for Mathematical Optimization)
      • HPC User Report from J. Hornich (Professur für Höchstleistungsrechnen)
      • HPC User Report from L. Folle and K. Tkotz (Chair of Computer Science 5, Pattern Recognition)
      • HPC User Report from R. Burlacu (Economics, Discrete Optimization, and Mathematics)
      • HPC User Report from S. Falk (Division of Phoniatrics and Pediatric Audiology)
      • HPC User Report from S. Falk (Phoniatrics and Pediatric Audiology)
      • HPC User Report from S. Jacob (Chair of System Simulation)
    • Electrical engineering & Audio processing
      • HPC User Report from N. Pia (AudioLabs)
      • HPC User Report from S. Balke (Audiolabs)
    • Geography & Climatology
      • HPC usage report from F. Temme, J. V. Turton, T. Mölg and T. Sauter
      • HPC usage report from J. Turton, T. Mölg and E. Collier
      • HPC usage report from N. Landshuter, T. Mölg, J. Grießinger, A. Bräuning and T. Peters
      • HPC User Report from C. Pickler and T. Mölg (Climate System Research Group)
      • HPC User Report from E. Collier (Climate System Research Group)
      • HPC User Report from E. Collier and T. Mölg (Climate System Research Group)
      • HPC User Report from E. Collier, T. Sauter, T. Mölg & D. Hardy (Climate System Research Group, Institute of Geography)
      • HPC User Report from E. Kropač, T. Mölg, N. J. Cullen, E. Collier, C. Pickler, and J. V. Turton (Climate System Research Group)
      • HPC User Report from J. Fürst (Department of Geography)
      • HPC User Report from P. Friedl (Department of Geography)
      • HPC User Report from T. Mölg (Climate System Research Group)
    • Linguistics
      • HPC User Report from P. Uhrig (Chair of English Linguistics)
    • Material sciences
      • HPC User Report from A. Rausch (Chair of Materials Science and Engineering for Metals)
      • HPC User Report from D. Wei (Chair of Materials Simulation)
      • HPC User Report from J. Köpf (Chair of Materials Science and Engineering for Metals)
      • HPC User Report from P. Baranova (Chair of General Materials Properties)
      • HPC User Report from S. Nasiri (Chair for Materials Simulation)
      • HPC User Report from S.A. Hosseini (Chair for Materials Simulation)
    • Medical research
      • HPC User Report from H. Sadeghi (Phoniatrics and Pediatric Audiology)
      • HPC User Report from P. Ritt (Imaging and Physics Group, Clinic of Nuclear Medicine)
      • HPC User Report from S. Falk (Division of Phoniatrics and Pediatric Audiology)
    • Physics
      • HPC User Report from D. Jankowsky (High-Energy Astrophysics)
      • HPC User Report from M. Maiti (Inst. Theoretische Physik 1)
      • HPC User Report from N. Vučemilović-Alagić (PULS group of the Physics Department)
      • HPC User Report from O. Malcioglu (Theoretische Festkörperphysik)
      • HPC User Report from S. Fey (Chair of Theoretical Physics I)
      • HPC User Report from S. Ninova (Theoretical Solid-State Physics)
      • HPC User Report from S. Schmidt (Erlangen Centre for Astroparticle Physics)
    • Regional users and student projects
      • HPC User Report from Dr. N. Ferruz (University of Bayreuth)
      • HPC User Report from J. Martens (Comprehensive Heart Failure Center / Universitätsklinikum Würzburg)
      • HPC User Report from M. Fritsche (HS-Coburg)
      • HPC User Report from M. Heß (TH-Nürnberg)
      • HPC User Report from M. Kögel (TH-Nürnberg)
  • NHR compute time projects

HPC User Report from N. Pia (AudioLabs)

Efficient high-quality neural speech coding at low bit rate

Contact:

Nicola Pia
International Audio Laboratories Erlangen
Friedrich-Alexander-Universität Erlangen-Nürnberg & Fraunhofer-Institut für Integrierte Schaltungen (IIS)

Mainly used HPC resources at RRZE

Alex Cluster

Very low bit rate speech coding is very challenging with classical coding techniques. Recently neural networks have started to fill this gap. In this work, we design Generative Adversarial Networks for coding of high-quality speech at low bit rates and low complexity. In particular we focus on how to reduce the model computational complexity, which enables its deployment on edge devices.

Motivation and problem definition

Speech coding enables the compression of speech waveform for communication and many other applications. With classical techniques it is possible to produce intelligible speech at very low bit rate, but this sounds robotic and unnatural. Neural vocoders such as WaveNet can produce high-quality speech from highly compressed inputs.

These models permit to approach the problem of speech coding from the data-driven perspective. Most of the solutions that can be found in literature suffer various disadvantages, which make them not suitable for the deployment in real-world scenarios. The primary issue is often computational complexity and generalization issues, and we set up to solve these.

Methods and codes

The speech coding quality gap and the new neural network solutions.

In our approach, we use Generative Adversarial Networks (GANs) for synthesizing the speech from a compressed bitstream. The bitstream can be either obtained from a classical speech encoder or be learned by an encoder neural network. We implement feature extraction (e.g. mel-spectrogram, MFCC, pitch, …), encoding, quantization (e.g. using classical methods or learned through a vector quantized auto-encoder), and decoding in python. The models are implemented in PyTorch and trained on large speech datasets such us VCTK and LibriTTS.

Results

We can show that our GAN coders can achieve the same speech quality as classical speech codecs using only half of the bit rate or even less. We can configure our model in such a way that enables frame-by-frame generation, which is crucial for low-delay communications scenarios. Finally, we can reduce the complexity of our model using, among other techniques, low rank approximations of the convolutional layers, making it suitable for deployment on edge CPUs.

Outreach

This research is in part a follow-up project to the publication:

  • Mustafa, A., Büthe, J., Korse, S., Gupta, K., Fuchs, G., & Pia, N. (2021, October). A Streamwise Gan Vocoder for Wideband Speech Coding at Very Low Bit Rate. In 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (pp. 66-70). IEEE.

And various other publications are planned.

Researcher’s Bio and Affiliation

Nicola Pia studied mathematics at the University of Cagliari, where he got his PhD under the supervision of Professor Gianluca Bande and Professor Dieter Kotschick from the Ludwig-Maximilian-University Munich. Since 2019, he works as a researcher in the field of AI and speech processing at the AudioLabs at Fraunhofer IIS in Erlangen.

Erlangen National High Performance Computing Center (NHR@FAU)
Martensstraße 1
91058 Erlangen
Germany
  • Imprint
  • Privacy
  • Accessibility
  • How to find us
Up