Name | Yatabe, Kohei (矢田部 浩平) | Position | Lecturer (fixed-term) | |

Affiliation | (School of Fundamental Science and Engineering) | |||

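## Other university affiliations

### University research institutes

*Institute of Wave Field and Communication Science (波動場・コミュニケーション科学研究所)*

Institute member, 2017-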

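## Education and degrees

### Education

April 2008 - March 2012 | Waseda University, School of Fundamental Science and Engineering, Department of Intermedia Art and Science |

April 2012 - March 2014 | Waseda University, Graduate School of Fundamental Science and Engineering, Department of Intermedia Studies (表現工学専攻) |

April 2014 - March 2017 | Waseda University, Graduate School of Fundamental Science and Engineering, Department of Intermedia Studies (表現工学専攻) |

### Degrees

Doctor of Engineering, Waseda University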

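## Career

April 2015 - March 2017 | Research Fellow, Japan Society for the Promotion of Science (JSPS) |

April 2017 - March 2018 | Assistant Professor, Department of Intermedia Art and Science, Waseda University |

April 2018 - | Lecturer (fixed-term), Department of Intermedia Art and Science, Waseda University |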

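## Committee and board memberships (external)

April 2013 - March 2019 | Steering committee member, Student and Young Researcher Forum, Acoustical Society of Japan |

May 2018 - | IEICE Technical Committee on Signal Processing |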

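## Awards

*13th Itakura Prize Innovative Young Researcher Award*

March 2018. Awarding institution: Acoustical Society of Japan

*Waseda University Teaching Award*

February 2018. Awarding institution: Waseda University

*38th Awaya Prize Young Researcher Award*

September 2015. Awarding institution: Acoustical Society of Japan

*8th Student Presentation Award*

March 2014. Awarding institution: Acoustical Society of Japan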

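## Research fields

### Keywords

Acoustical engineering, optical measurement, signal processing

### KAKENHI classification

Informatics / Human informatics / Perceptual information processing

Engineering / Mechanical engineering / Dynamics and control

Engineering / Electrical and electronic engineering / Measurement engineering

## Papers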

*Consistent ICA: Determined BSS meets spectrogram consistency*

Yatabe, Kohei

IEEE Signal Processing Letters, peer-reviewed, vol. 27, pp. 870-874, May 2020

Publication type: research paper (journal)

Abstract: Multichannel audio blind source separation (BSS) in the determined situation (the number of microphones is equal to that of the sources), or determined BSS, is performed by multichannel linear filtering in the time-frequency domain to handle the convolutive mixing process. Ordinarily, the filter treats each frequency independently, which causes the well-known permutation problem, i.e., the problem of how to align the frequency-wise filters so that each separated component is correctly assigned to the corresponding source. In this paper, it is shown that the general property of the time-frequency-domain representation called spectrogram consistency can assist in solving the permutation problem.

*Speech enhancement using self-adaptation and multi-head self-attention*

Koizumi, Yuma; Yatabe, Kohei; Delcroix, Marc; Masuyama, Yoshiki; Takeuchi, Daiki

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 181-185, May 2020

Publication type: research paper (international conference proceedings)

Abstract: This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features; we extract a speaker representation used for adaptation directly from the test utterance. Conventional studies of deep neural network (DNN)-based speech enhancement mainly focus on building a speaker-independent model. Meanwhile, in speech applications including speech recognition and synthesis, it is known that model adaptation to the target speaker improves the accuracy. Our research question is whether a DNN for speech enhancement can be adapted to unknown speakers without any auxiliary guidance signal in the test phase. To achieve this, we adopt multi-task learning of speech enhancement and speaker identification, and use the output of the final hidden layer of the speaker identification branch as an auxiliary feature. In addition, we use multi-head self-attention for capturing long-term dependencies in the speech and noise. Experimental results on a public dataset show that our strategy achieves state-of-the-art performance and also outperforms conventional methods in terms of subjective quality.

*Maximally energy-concentrated differential window for phase-aware signal processing using instantaneous frequency*

Kusano, Tsubasa; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 5825-5829, May 2020

Publication type: research paper (international conference proceedings)

Abstract: The short-time Fourier transform (STFT), whose properties depend on the window function, is widely employed in non-stationary signal analysis. Instantaneous frequency in STFT, the time-derivative of phase, has recently been applied to many applications including spectrogram reassignment. The computation of instantaneous frequency requires STFT with the window and STFT with the (time-)differential window, i.e., the computation of instantaneous frequency depends on both the window function and its time derivative. To obtain the instantaneous frequency accurately, the sidelobes of the frequency response of the differential window should be reduced because the sidelobes cause mixing of multiple components. In this paper, we propose window functions suitable for computing the instantaneous frequency, which are designed by minimizing the sidelobe energy of the frequency response of the differential window.

*Phase reconstruction based on recurrent phase unwrapping with deep neural networks*

Masuyama, Yoshiki; Yatabe, Kohei; Koizumi, Yuma; Oikawa, Yasuhiro; Harada, Noboru

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 826-830, May 2020

Publication type: research paper (international conference proceedings)

Abstract: Phase reconstruction, which estimates phase from a given amplitude spectrogram, is an active research field in acoustical signal processing with many applications including audio synthesis. To take advantage of rich knowledge from data, several studies presented deep neural network (DNN)-based phase reconstruction methods. However, the training of a DNN for phase reconstruction is not an easy task because phase is sensitive to the shift of a waveform. To overcome this problem, we propose a DNN-based two-stage phase reconstruction method. In the proposed method, DNNs estimate phase derivatives instead of phase itself, which allows us to avoid the sensitivity problem. Then, phase is recursively estimated based on the estimated derivatives, which is named recurrent phase unwrapping (RPU). The experimental results confirm that the proposed method outperformed the direct phase estimation by a DNN.

*Real-time speech enhancement using equilibriated RNN*

Takeuchi, Daiki; Yatabe, Kohei; Koizumi, Yuma; Oikawa, Yasuhiro; Harada, Noboru

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 851-855, May 2020

Publication type: research paper (international conference proceedings)

Abstract: We propose a speech enhancement method using a causal deep neural network (DNN) for real-time applications. DNNs have been widely used for estimating a time-frequency (T-F) mask which enhances a speech signal. One popular DNN structure for that is a recurrent neural network (RNN) owing to its capability of effectively modelling time-sequential data like speech. In particular, the long short-term memory (LSTM) is often used to alleviate the vanishing/exploding gradient problem which makes the training of an RNN difficult. However, the number of parameters of LSTM increases as the price of mitigating the difficulty of training, which requires more computational resources. For real-time speech enhancement, it is preferable to use a smaller network without losing performance. In this paper, we propose to use the equilibriated recurrent neural network (ERNN) for avoiding the vanishing/exploding gradient problem without increasing the number of parameters. The proposed structure is causal, i.e., it requires only information from the past, so that it can be applied in real time. Compared to the uni- and bi-directional LSTM networks, the proposed method achieved similar performance with far fewer parameters.

*Invertible DNN-based nonlinear time-frequency transform for speech enhancement*

Takeuchi, Daiki; Yatabe, Kohei; Koizumi, Yuma; Oikawa, Yasuhiro; Harada, Noboru

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 6644-6648, May 2020

Publication type: research paper (international conference proceedings)

Abstract: We propose an end-to-end speech enhancement method with a trainable time-frequency (T-F) transform based on an invertible deep neural network (DNN). The recent development of speech enhancement has been driven by the use of DNNs. Ordinary DNN-based speech enhancement employs a T-F transform, typically the short-time Fourier transform (STFT), and estimates a T-F mask using a DNN. On the other hand, some methods have considered end-to-end networks which directly estimate the enhanced signals without a T-F transform. While end-to-end methods have shown promising results, they are black boxes and hard to understand. Therefore, some end-to-end methods used a DNN to learn a linear T-F transform, which is much easier to understand. However, the learned transform may not have a property important for ordinary signal processing. In this paper, perfect reconstruction is considered as the important property of the T-F transform. An invertible nonlinear T-F transform is constructed by DNNs and learned from data so that the obtained transform is a perfect-reconstruction filterbank.

*Stable training of DNN for speech enhancement based on perceptually-motivated black-box cost function*

Kawanaka, Masaki; Koizumi, Yuma; Miyazaki, Ryoichi; Yatabe, Kohei

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 7524-7528, May 2020

Publication type: research paper (international conference proceedings)

Abstract: Improving subjective sound quality of enhanced signals is one of the most important missions in speech enhancement. For evaluating the subjective quality, several methods related to perceptually-motivated objective sound quality assessment (OSQA) have been proposed, such as PESQ (perceptual evaluation of speech quality). However, direct use of such measures for training a deep neural network (DNN) is not possible in most cases because popular OSQAs are non-differentiable with respect to DNN parameters. Therefore, a previous study proposed to approximate the score of OSQAs by an auxiliary DNN so that its gradient can be used for training the primary DNN. One problem with this approach is instability of the training caused by the approximation error of the score. To overcome this problem, we propose to use stabilization techniques borrowed from reinforcement learning. The experiments, aimed at increasing the PESQ score as an example, show that the proposed method (i) can stably train a DNN to increase PESQ, (ii) achieved the state-of-the-art PESQ score on a public dataset, and (iii) resulted in better sound quality than conventional methods based on subjective evaluation.

*Time-frequency-masking-based determined BSS with application to sparse IVA*

Yatabe, Kohei; Kitamura, Daichi

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 715-719, May 2019

Abstract: Most of the determined blind source separation (BSS) algorithms related to the independent component analysis (ICA) were derived from mathematical models of source signals. However, such derivation restricts the application of algorithms to explicitly definable source models, i.e., an implicit model associated with some signal-processing procedure cannot be utilized within such a framework. In this paper, we propose an extension of the existing algorithm so that any time-frequency masking method (e.g., those developed in the speech enhancement literature) can be incorporated into the determined BSS algorithm. As an application of the proposed algorithm, a sparse extension of the well-known independent vector analysis (IVA) is also proposed to illustrate the potential of the masking-based implicit source model.

*Data-driven design of perfect reconstruction filterbank for DNN-based sound source enhancement*

Takeuchi, Daiki; Yatabe, Kohei; Koizumi, Yuma; Oikawa, Yasuhiro; Harada, Noboru

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 596-600, May 2019

Publication type: research paper (international conference proceedings)

Abstract: We propose a data-driven design method of a perfect-reconstruction filterbank (PRFB) for sound-source enhancement (SSE) based on a deep neural network (DNN). DNNs have been used to estimate a time-frequency (T-F) mask in the short-time Fourier transform (STFT) domain. Their training is more stable when a simple cost function such as the mean-squared error (MSE) is utilized than when an advanced cost such as an objective sound quality assessment is used. However, such a simple cost function inherits strong assumptions on the statistics of the target and/or noise which are often not satisfied, and the mismatch of assumptions results in degraded performance. In this paper, we propose to design the frequency scale of the PRFB from training data so that the assumption on MSE is satisfied. For designing the frequency scale, the warped filterbank frame (WFBF) is considered as the PRFB. The frequency characteristic of the learned WFBF was in between STFT and the wavelet transform, and its effectiveness was confirmed by comparison with a standard STFT-based DNN whose input feature is compressed into the mel scale.

*Deep Griffin–Lim iteration*

Masuyama, Yoshiki; Yatabe, Kohei; Koizumi, Yuma; Oikawa, Yasuhiro; Harada, Noboru

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 61-65, May 2019

Publication type: research paper (international conference proceedings)

Abstract: This paper presents a novel phase reconstruction method (only from a given amplitude spectrogram) by combining a signal-processing-based approach and a deep neural network (DNN). To retrieve a time-domain signal from its amplitude spectrogram, the corresponding phase is required. One of the popular phase reconstruction methods is the Griffin–Lim algorithm (GLA), which is based on the redundancy of the short-time Fourier transform. However, GLA often involves many iterations and produces low-quality signals owing to the lack of prior knowledge of the target signal. In order to address these issues, in this study, we propose an architecture which stacks a sub-block including two GLA-inspired fixed layers and a DNN. The number of stacked sub-blocks is adjustable, and we can trade off performance against computational load based on the requirements of applications. The effectiveness of the proposed method is investigated by reconstructing phases from amplitude spectrograms of speech.
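For readers unfamiliar with the baseline mentioned in this abstract, the following is a minimal sketch of the plain Griffin–Lim iteration, not of the DNN-based method the paper proposes. It alternates between restoring the given amplitude and projecting onto spectrograms that correspond to an actual signal. The function names (`stft`, `istft`, `griffin_lim`), the circular framing, the square-root Hann window, and the test signal are all assumptions made for illustration.

```python
import numpy as np

def stft(x, win, hop):
    """Circular STFT (illustrative): frame m starts at sample m*hop and wraps."""
    n = len(win)
    idx = (np.arange(0, len(x), hop)[:, None] + np.arange(n)[None, :]) % len(x)
    return np.fft.rfft(x[idx] * win, axis=1)

def istft(spec, win, hop, length):
    """Windowed overlap-add; inverts stft() when the shifted win**2 sum to 1."""
    n = len(win)
    frames = np.fft.irfft(spec, n=n, axis=1) * win
    idx = (np.arange(0, length, hop)[:, None] + np.arange(n)[None, :]) % length
    x = np.zeros(length)
    np.add.at(x, idx, frames)  # accumulate overlapping frames
    return x

def griffin_lim(amplitude, win, hop, length, n_iter=50, seed=0):
    """Plain Griffin-Lim: returns a signal and the per-iteration amplitude mismatch."""
    rng = np.random.default_rng(seed)
    spec = amplitude * np.exp(2j * np.pi * rng.random(amplitude.shape))  # random phase
    errors = []
    for _ in range(n_iter):
        # Projection onto consistent spectrograms (those of some real signal).
        consistent = stft(istft(spec, win, hop, length), win, hop)
        errors.append(np.linalg.norm(np.abs(consistent) - amplitude)
                      / np.linalg.norm(amplitude))
        # Projection onto the amplitude constraint: keep phase, restore amplitude.
        spec = amplitude * np.exp(1j * np.angle(consistent))
    return istft(spec, win, hop, length), errors

# Assumed setup: square-root periodic Hann window with 75% overlap, scaled so
# that the squared window summed over its four shifts equals 1.
n, hop, length = 256, 64, 4096
win = np.sqrt((0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)) / 2)
t = np.arange(length)
x = np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 0.12 * t)
amplitude = np.abs(stft(x, win, hop))
y, errors = griffin_lim(amplitude, win, hop, length)
```

The list `errors` tracks how far the amplitude of the re-analyzed signal is from the given amplitude; its slow decay over many iterations is the behavior the abstract refers to.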

*Phase-aware harmonic/percussive source separation via convex optimization*

Masuyama, Yoshiki; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 985-989, May 2019

Publication type: research paper (international conference proceedings)

Abstract: Decomposition of an audio mixture into harmonic and percussive components, namely harmonic/percussive source separation (HPSS), is a useful pre-processing tool for many audio applications. Popular approaches to HPSS exploit the distinctive source-specific structures of power spectrograms. However, such approaches consider only power spectrograms, and the phase remains intact for resynthesizing the separated signals. In this paper, we propose a phase-aware HPSS method based on the structure of the phase of harmonic components. It is formulated as a convex optimization problem in the time domain, which enables the simultaneous treatment of both amplitude and phase. The numerical experiment validates the effectiveness of the proposed method.

*Low-rankness of complex-valued spectrogram and its application to phase-aware audio processing*

Masuyama, Yoshiki; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 855-859, May 2019

Publication type: research paper (international conference proceedings)

Abstract: Low-rankness of amplitude spectrograms has been effectively utilized in audio signal processing methods including non-negative matrix factorization. However, such methods have a fundamental limitation owing to their amplitude-only treatment, where the phase of the observed signal is utilized for resynthesizing the estimated signal. In order to address this limitation, we directly treat a complex-valued spectrogram and show that a complex-valued spectrogram of a sum of sinusoids can be made approximately low-rank by modifying its phase. For evaluating the applicability of the proposed low-rank representation, we further propose a convex prior emphasizing harmonic signals, and it is applied to audio denoising.

*Guided-spatio-temporal filtering for extracting sound from optically measured images containing occluding objects*

Tanigawa, Risako; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 945-949, May 2019

Publication type: research paper (international conference proceedings)

Abstract: Recent development of optical interferometry enables us to measure sound without placing any device inside the sound field. In particular, parallel phase-shifting interferometry (PPSI) has realized advanced measurement of the refractive index of air. Its novel application investigated very recently is simultaneous visualization of flow and sound, which had been difficult until PPSI enabled high-speed and accurate measurement several years ago. However, for understanding aerodynamic sound, separation of air flow and sound is necessary since they are mixed up in the observed video. In this paper, guided-spatio-temporal filtering is proposed to separate sound from the optically measured images. Guided filtering is combined with a physical-model-based spatio-temporal filterbank for extracting sound-related information without the undesired effect caused by the image boundary or occluding objects. Such image boundaries and occluding objects are typical difficulties that arise in signal processing of an optically measured sound field.

*Representation of complex spectrogram via phase conversion*

Yatabe, Kohei; Masuyama, Yoshiki; Kusano, Tsubasa; Oikawa, Yasuhiro

Acoustical Science and Technology, invited, vol. 40, no. 3, pp. 170-177, May 2019

Publication type: research paper (journal)

Abstract: As the importance of the phase of complex spectrograms has been widely recognized, many techniques have been proposed for handling it. However, several definitions and terminologies for the same concept can be found in the literature, which has confused beginners. In this paper, two major definitions of the short-time Fourier transform and their phase conventions are summarized to alleviate such complication. A phase-aware signal-processing scheme based on phase conversion is also introduced with a set of executable MATLAB functions (https://doi.org/10/c3qb).

*Modal decomposition of musical instrument sounds via optimization-based non-linear filtering*

Masuyama, Yoshiki; Kusano, Tsubasa; Yatabe, Kohei; Oikawa, Yasuhiro

Acoustical Science and Technology, peer-reviewed, vol. 40, no. 3, pp. 186-197, May 2019

Publication type: research paper (journal)

Abstract: For musical instrument sounds containing partials, which are referred to as modes, the decaying processes of the modes significantly affect the timbre of musical instruments and characterize the sounds. However, their accurate decomposition around the onset is not an easy task, especially when the sounds have sharp onsets and contain non-modal percussive components such as the attack. This is because the sharp onsets of modes comprise peaky but broad spectra, which makes it difficult to get rid of the attack component. In this paper, an optimization-based method of modal decomposition is proposed to overcome this difficulty. The proposed method is formulated as a constrained optimization problem to enforce the perfect reconstruction property, which is important for accurate decomposition, and the causality of modes. Three numerical simulations and an application to real piano sounds confirm the performance of the proposed method.

*Source directivity approximation for finite-difference time-domain simulation by estimating initial value*

Takeuchi, Daiki; Yatabe, Kohei; Oikawa, Yasuhiro

Journal of the Acoustical Society of America, peer-reviewed, vol. 145, no. 4, pp. 2638-2649, April 2019

Publication type: research paper (journal)

Abstract: In order to incorporate a directive sound source into acoustic simulation using the finite-difference time-domain method (FDTD), this paper proposes an optimization-based method to estimate the initial value which approximates a desired directional pattern after propagation. The proposed method explicitly considers a discretized FDTD scheme and optimizes the initial value directly in the time domain so that every effect of the discretization error of FDTD, including numerical dispersion, is taken into account. It is also able to consider a frequency-wise directivity by integrating the Fourier transform into the optimization procedure, even though the estimated result is defined in the time domain. After the optimization, the obtained result can be utilized in any acoustic simulation based on the same FDTD scheme without modification because the result is represented as the initial value to be propagated and no additional procedure is required.

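*Optical measurement of sound (光学的音響計測)*

Yatabe, Kohei; Ishikawa, Kenji; Tanigawa, Risako; Oikawa, Yasuhiro

IEICE Fundamentals Review, invited, vol. 12, no. 4, pp. 259-268, April 2019

Publication type: research paper (journal)

Abstract: Because sound is a variation in the density of air, it can be recorded without contact by optically detecting the refractive index, which depends on the air density. Light can therefore measure sound fields that cannot be handled by a physical microphone. This article outlines the history and principles of the "optical measurement of airborne audible sound" that the authors have worked on, together with the optics of the two types of interferometers used (the laser Doppler vibrometer and the polarization high-speed interferometer), and introduces several applications to real sound fields that can only be measured with light.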

*Extracting sound from flow measured by parallel phase-shifting interferometry using spatio-temporal filter*

Tanigawa, Risako; Ishikawa, Kenji; Yatabe, Kohei; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato

Proc. SPIE 10997, Three-Dimensional Imaging, Visualization, and Display 2019, pp. 10997-1 to 10997-6, April 2019

Publication type: research paper (international conference proceedings)

Abstract: We have proposed a method of simultaneously measuring aerodynamic sound and fluid flow using parallel phase-shifting interferometry (PPSI). PPSI can observe the phase of light instantaneously and quantitatively. This method is useful for understanding the aerodynamic sound because PPSI can measure near the source of the aerodynamic sound. However, the components of sound and flow should be separated in order to observe detail near the source of sound inside a region of flow. Therefore, we consider a separation of the component of sound from simultaneously visualized images of sound and flow. In previous research, a spatio-temporal filter was used to extract a component satisfying the wave equation. The flow and the sound are different physical phenomena, and the flow cannot be expressed by the wave equation. Hence, we think that the spatio-temporal filter enables us to separate the component of sound from the simultaneously visualized images. In this paper, we propose a method for separation of flow and sound using a spatio-temporal filter in order to visualize the component of the aerodynamic sound near its source. We conducted an experiment of the separation of data measured by PPSI. The results show that the spatio-temporal filter can extract the sound from airflow except for the sound near objects and boundaries.

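*Representation of complex spectrogram via phase conversion (位相変換による複素スペクトログラムの表現)*

Yatabe, Kohei; Masuyama, Yoshiki; Kusano, Tsubasa; Oikawa, Yasuhiro

Journal of the Acoustical Society of Japan (日本音響学会誌), invited, vol. 75, no. 3, pp. 147-155, March 2019

Publication type: research paper (journal)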

*Griffin-Lim like phase recovery via alternating direction method of multipliers*

Masuyama, Yoshiki; Yatabe, Kohei; Oikawa, Yasuhiro

IEEE Signal Processing Letters, peer-reviewed, vol. 26, no. 1, pp. 184-188, January 2019

Publication type: research paper (journal)

Abstract: Recovering a signal from its amplitude spectrogram, or phase recovery, has many applications in acoustic signal processing. When only an amplitude spectrogram is available and no explicit information is given for the phases, the Griffin-Lim algorithm (GLA) is one of the most utilized methods for phase recovery. However, GLA often requires many iterations and results in low perceptual quality in some cases. In this letter, we propose two novel algorithms based on GLA and the alternating direction method of multipliers (ADMM) for better recovery with fewer iterations. Some interpretation of the existing methods and their relation to the proposed method are also provided. Evaluations are performed with both an objective measure and a subjective test.

*Visualization system for sound field using see-through head-mounted display*

Inoue, Atsuto; Ikeda, Yusuke; Yatabe, Kohei; Oikawa, Yasuhiro

Acoustical Science and Technology, peer-reviewed, vol. 40, no. 1, p. 1, January 2019

Publication type: research paper (journal)

Abstract: For the visualization of a sound field, a widely used method is the superimposition of the sound information onto a camera view. Although it effectively enables understanding of the relationship between space and sound, a planar display cannot resolve depth information in a straightforward manner. In contrast, a see-through head-mounted display (STHMD) is capable of representing three-dimensional (3D) vision and natural augmented reality (AR) or mixed reality (MR). In this paper, we propose a system for the measurement and visualization of a sound field with an STHMD. We created two visualization systems using different types of STHMDs and technologies for realizing AR/MR, and a measurement system for a 3D sound intensity map, which can be used together with the visualization system. Through three visualization experiments, we empirically found that the stereoscopic viewing and the convenient viewpoint movement associated with the STHMD enable understanding of the sound field in a short time.

*3D sound source localization based on coherence-adjusted monopole dictionary and modified convex clustering*

Tachikawa, Tomoya; Yatabe, Kohei; Oikawa, Yasuhiro

Applied Acoustics, peer-reviewed, vol. 139, pp. 267-281, October 2018

Publication type: research paper (journal). ISSN: 0003-682X

Abstract: In this paper, a sound source localization method for simultaneously estimating both direction-of-arrival (DOA) and distance from the microphone array is proposed. For estimating distance, the off-grid problem must be overcome because the range of distance to be considered is quite broad and not even bounded. The proposed method estimates the positions based on a modified version of the convex clustering method combined with sparse coefficient estimation. A method for constructing a suitable monopole dictionary based on the coherence is also proposed so that the convex-clustering-based method can appropriately estimate the distance of the sound sources. Numerical and measurement experiments were performed to investigate the performance of the proposed method.

*Rectified linear unit can assist Griffin-Lim phase recovery*

Yatabe, Kohei; Masuyama, Yoshiki; Oikawa, Yasuhiro

IWAENC, International Workshop on Acoustic Signal Enhancement, peer-reviewed, pp. 555-559, September 2018

Publication type: research paper (international conference proceedings)

Abstract: Phase recovery is an essential process for reconstructing a time-domain signal from the corresponding spectrogram when its phase is contaminated or unavailable. Recently, a phase recovery method using a deep neural network (DNN) was proposed, which interested us because the inverse short-time Fourier transform (inverse STFT) was utilized within the network. This inverse STFT converts a spectrogram into its time-domain counterpart, and then the activation function, leaky rectified linear unit (ReLU), is applied. Such a nonlinear operation in the time domain resembles the speech enhancement method called harmonic regeneration noise reduction (HRNR). In HRNR, a time-domain nonlinearity, typically ReLU, is applied to assist in enhancing the higher-order harmonics. From this point of view, one question arose in our mind: Can time-domain ReLU alone assist phase recovery? Inspired by this curious connection between the recent DNN-based phase recovery method and HRNR in speech enhancement, the ReLU-assisted Griffin–Lim algorithm is proposed in this paper to investigate the above question. Through an experiment of speech denoising with the oracle Wiener filter, some positive effect of the time-domain nonlinearity is confirmed in terms of the scores of the short-time objective intelligibility (STOI).

*Underdetermined source separation with simultaneous DOA estimation without initial value dependency*

Tachikawa, Tomoya; Yatabe, Kohei; Oikawa, Yasuhiro

IWAENC, International Workshop on Acoustic Signal Enhancement, peer-reviewed, pp. 161-165, September 2018

Publication type: research paper (international conference proceedings)

Abstract: In this paper, a sparsity-based method for solving an underdetermined source separation problem is proposed. The proposed method is formulated as a convex optimization problem with two kinds of sparsity priors: sparsity in the time-frequency domain and in direction-of-arrival (DOA). These priors enable simultaneous estimation of DOA and sound sources, while the estimation result does not depend on the initialization method thanks to the convexity. Experiments using 4 sound sources recorded by 2 microphones confirmed that every random initial value in the proposed method resulted in the same performance, which was better than that of the conventional methods.

*Separating stereo audio mixture having no phase difference by convex clustering and disjointness map*

Hiruma, Atsushi; Yatabe, Kohei; Oikawa, Yasuhiro

IWAENC, International Workshop on Acoustic Signal Enhancement, peer-reviewed, pp. 266-270, September 2018

Publication type: research paper (international conference proceedings)

Abstract: In this paper, a method of constructing binary masks for stereo source separation is proposed. The proposed method consists of two main factors: (1) a disjointness map and (2) convex directional clustering. The disjointness map quantifies the degree of mixing at each time-frequency bin based on instantaneous frequencies. This map enables effective use of phase information, which is usually omitted in stereo music separation. Then, the convex clustering utilizes flexible definitions of adjacency of the time-frequency bins for incorporating more information into directional clustering. Experimental results indicate that the proposed method can obtain a mask closer to the ideal one than the conventional directional clustering.

*Model-based phase recovery of spectrograms via optimization on Riemannian manifolds*

Masuyama, Yoshiki; Yatabe, Kohei; Oikawa, Yasuhiro

IWAENC, International Workshop on Acoustic Signal Enhancement, peer-reviewed, pp. 126-130, September 2018

Publication type: research paper (international conference proceedings)

Abstract: In acoustical signal processing, the importance of modifying the phase spectrogram has been shown. Recently, model-based phase recovery methods based on the sinusoidal model have been studied. Although their effectiveness has been proven, some of them deal with the phase in inflexible forms owing to the wrapping effect of phase. In addition, they need much pre-processing, including the estimation of the instantaneous frequency, which is not an easy task. In order to overcome these issues, we propose a novel model-based phase recovery method which is formulated as an optimization over complex-valued phases. In the proposed method, the instantaneous frequency is not fixed in advance, which avoids the need to estimate it beforehand. The technique of optimization on Riemannian manifolds is adopted for efficient computation. The proposed method is validated by noise reduction of audio signals.

*Optical visualization of sound field inside transparent cavity using polarization high-speed camera*

Ishikawa, Kenji; Yatabe, Kohei; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato

Proceedings of INTER-NOISE 2018 - 47th International Congress and Exposition on Noise Control Engineering, invited, pp. 1806-1 to 1806-7, August 2018

Publication type: research paper (international conference proceedings)

Abstract: Visualization of a sound field is a powerful tool for understanding acoustic phenomena. Methods using a microphone array such as beamforming and near-field acoustic holography have been widely studied, and these have been applied to industrial problems. As alternative choices for the visualization of the sound field, optical methods have gained a considerable amount of attention due to their capability of non-intrusive measurement. These include laser Doppler vibrometry, shadowgraphy, the schlieren method, optical digital holography, and parallel phase-shifting interferometry (PPSI). These methods are well developed for the visualization of propagating sound waves in a free field. Also, as these methods can observe the sound field without installing any instruments in the field to be measured, they have the potential to achieve visualization of sound inside a cavity, which is quite important for duct acoustics. This paper presents the single-shot visualization of the sound field inside a transparent cavity using PPSI with a high-speed camera, as well as a brief review of the development of the optical measurement of sound. For the experiments, the sound field inside a ported speaker box made of acrylic plates was measured. Acoustic resonances and mode patterns inside the box were successfully captured.

*Measurement of sound pressure inside tube using optical interferometry*

Hermawanto, Denny; Ishikawa, Kenji; Yatabe, Kohei; Oikawa, Yasuhiro

Proceedings of INTER-NOISE 2018 - 47th International Congress and Exposition on Noise Control Engineering, pp. 1688-1 - 1688-11, August 2018

Publication type: Research paper (international conference proceedings)

Abstract: Measurement of sound pressure inside a tube is important for duct acoustics and microphone calibration. Inserting a microphone directly into the sound field disturbs the field and produces inaccurate measurement results. Recently, non-intrusive sound pressure measurement using optical techniques has been proposed. A laser Doppler vibrometer is used to measure line integrals of the sound pressure, which yield projections, and a reconstruction technique is then applied to recover the original sound field from the projections. In this paper, measurement of the sound pressure distribution using an optical method is proposed to realize direct pressure measurement for microphone calibration. A simulation of sound field reconstruction from projections using the filtered back-projection technique was developed, and its performance was evaluated for projections of plane waves and point-source waves at frequencies from 1000 Hz to 16000 Hz. The proposed method was then applied to reconstruct the sound field of a 1000 Hz source inside an acrylic tube with a diameter of 61.75 mm, a length of 22 mm, and a thickness of 3.5 mm from projections measured with a laser Doppler vibrometer. The results show that the proposed method was able to reconstruct the sound field inside the tube and measure the pressure distribution.

*Optical visualization of sound source of edge tone using parallel phase-shifting interferometry*

Tanigawa, Risako; Ishikawa, Kenji; Yatabe, Kohei; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato

Proceedings of INTER-NOISE 2018 - 47th International Congress and Exposition on Noise Control Engineering, pp. 1494-1 - 1494-9, August 2018

Publication type: Research paper (international conference proceedings)

Abstract: In order to reduce aerodynamic noise, understanding the nature of aerodynamic sound sources is important. Generally, aerodynamic sound is measured using microphones. However, microphones must be installed far from the aerodynamic sound sources, which makes it difficult to understand the nature of those sources. As non-contact alternatives, optical methods for measuring sound have been proposed. Among those, parallel phase-shifting interferometry (PPSI) can capture time-varying phenomena. Recently, simultaneous visualization of flow and sound using PPSI has been proposed. This method makes it possible to capture the propagation of sound inside a flow. In this paper, as an application of aerodynamic sound visualization using PPSI, results of visualizing the sound sources of edge tones are shown. The nozzle-edge distance and the flow rate, which are the parameters that change the frequency of an edge tone, were adjusted from 5 mm to 13 mm at intervals of 1 mm and from 15 L/min to 45 L/min at intervals of 5 L/min, respectively. The frame rate of the high-speed camera in PPSI was set to 20,000 frames per second, and the size of the visualization area was 77 mm by 56 mm. From the visualized images, the characteristics of the spatial spread of edge tones were observed.

*Optical visualization of a fluid flow via the temperature controlling method*

Tanigawa, Risako; Ishikawa, Kenji; Yatabe, Kohei; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato

Optics Letters, peer-reviewed, 43(14), pp. 3273 - 3276, July 2018

Publication type: Research paper (academic journal)

Abstract: © 2018 Optical Society of America. In this Letter, a method for visualizing a fluid flow through temperature control is proposed. The proposed method makes an invisible fluid flow visible by controlling its temperature, so that its visibility can be easily adjusted. This ability to adjust the appearance is effective for visualizing phenomena consisting of multiple physical processes. To verify the validity of the proposed method, an experiment visualizing both flow and sound in air using parallel phase-shifting interferometry was conducted under conditions similar to those of the previous research [Opt. Lett. 43, 991 (2018)].

*Time-directional filtering of wrapped phase for observing transient phenomena with parallel phase-shifting interferometry*

Yatabe, Kohei; Tanigawa, Risako; Ishikawa, Kenji; Oikawa, Yasuhiro

Optics Express, peer-reviewed, 26(11), pp. 13705 - 13720, May 2018

Publication type: Research paper (academic journal)

Abstract: © 2018 Optical Society of America. Recent development of parallel phase-shifting interferometry (PPSI) enables accurate measurement of time-varying phase maps. By combining a high-speed camera with PPSI, it has become possible to observe phenomena that are not only time-varying but also fast, including fluid flow and sound in air. In such observation, one has to remove the static phase (time-invariant or slowly varying phase unrelated to the phenomena of interest) from the observed phase maps. Ordinarily, a signal processing method for eliminating the static phase is applied after phase unwrapping to avoid the 2π discontinuity, which can be a source of error. In this paper, it is shown that such phase unwrapping is not necessary for high-speed observation, and a time-directional filtering method is proposed for removing the static phase directly from the wrapped phase without performing phase unwrapping. In addition, experimental results of simultaneously visualizing flow and sound at 42,000 fps are shown to illustrate how the time-directional filtering changes the appearance. MATLAB code is included in the paper (also available at https://goo.gl/N4wzdp) to aid understanding of the proposed method.
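As a rough illustration of why a static phase can be removed without unwrapping, the sketch below applies a first-order time difference to a wrapped phase sequence in the complex-exponential domain. This is a minimal example under stated assumptions, not the filter proposed in the paper: the function names `wrap` and `time_difference_wrapped`, the choice of a simple frame-to-frame difference as the high-pass filter, and the synthetic signal parameters are all hypothetical.

```python
import numpy as np

def wrap(phi):
    """Wrap phase values into the interval (-pi, pi]."""
    return np.angle(np.exp(1j * phi))

def time_difference_wrapped(wrapped, axis=0):
    """First-order time difference of wrapped phase, computed without unwrapping.

    Assumption: the true phase changes by less than pi between frames,
    which is plausible when the frame rate is high relative to the signal.
    """
    z = np.exp(1j * wrapped)
    n = z.shape[axis]
    lead = np.take(z, np.arange(1, n), axis=axis)      # frames 1 .. n-1
    lag = np.take(z, np.arange(0, n - 1), axis=axis)   # frames 0 .. n-2
    # Multiplying by the conjugate subtracts phases modulo 2*pi,
    # so any time-invariant (static) phase cancels exactly.
    return np.angle(lead * np.conj(lag))

# Synthetic example (hypothetical values): a large static phase map plus a
# small sinusoidal phase modulation observed at 42,000 frames per second.
fs = 42_000.0
t = np.arange(200) / fs
static = np.array([[5.0, -7.3, 12.1, 0.4]])            # shape (1, pixels)
sound = 0.01 * np.sin(2 * np.pi * 8_700.0 * t)[:, None]  # shape (frames, 1)
observed = wrap(static + sound)                        # wrapped observation

filtered = time_difference_wrapped(observed, axis=0)
```

Because the difference is taken on the unit circle, the 2π jumps in `observed` never enter the result, and `filtered` matches the frame-to-frame difference of the small modulation alone.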

*Localization of marine seismic vibrator based on hyperbolic Radon transform*

Kusano, Tsubasa; Yatabe, Kohei; Oikawa, Yasuhiro

Acoustical Science and Technology, peer-reviewed, 39(3), pp. 215 - 225, May 2018

Publication type: Research paper (academic journal); ISSN: 1346-3969

Abstract: © 2018 The Acoustical Society of Japan. In marine seismic surveys for exploring seafloor resources, the structure below the seafloor is estimated from the received sound waves, which are emitted by a marine seismic sound source and reflected or refracted between the layers below the seafloor. To estimate the structure below the seafloor from the returned waves, information on the sound source position and the sound speed is needed. Marine seismic vibrators, which are one type of marine seismic sound source, have advantages such as high controllability of the frequency and phase of the sound and the ability to oscillate at great depth. However, when the sound source is far from the sea surface, it becomes difficult to determine its exact position. In this paper, we propose a method for estimating the position of a marine seismic vibrator and the sound speed from the obtained seismic data by formulating an optimization problem via the hyperbolic Radon transform. Numerical simulations confirmed that the proposed method almost achieves the theoretical lower bounds for the variances of the estimates.

*Determined blind source separation via proximal splitting algorithm*

Yatabe, Kohei; Kitamura, Daichi

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 776 - 780, April 2018

Publication type: Research paper (international conference proceedings)

Abstract: State-of-the-art algorithms for determined blind source separation (BSS) based on independent component analysis (ICA) have gained computational efficiency through the majorization-minimization (MM) principle at the price of losing flexibility. That is, replacing and comparing different source models is not easy in such an MM-based framework because a new algorithm must be derived each time the model is changed. In this paper, a general framework for obtaining an ICA-based BSS algorithm is proposed so that a source model can easily be replaced, because only a single line of the algorithm must be modified. A sparsity-based extension of independent vector analysis and a low-rankness-based BSS model using the nuclear norm are also proposed to demonstrate the simplicity and ease of use of the proposed framework.
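To make the "single line" idea concrete, the sketch below shows two interchangeable proximal operators of the kind that proximal splitting algorithms generally use to represent a penalty term. It is only an illustration under assumptions: the abstract does not give the algorithm, so the function names `prox_l1` and `prox_l21`, the threshold `lam`, and the convention that the grouping axis is frequency are hypothetical, and the surrounding separation algorithm is not reproduced here.

```python
import numpy as np

def prox_l1(x, lam):
    """Proximal operator of lam * ||x||_1 (element-wise soft-thresholding).

    Works for real or complex arrays: magnitudes shrink by lam, phases are kept.
    """
    mag = np.abs(x)
    scale = np.maximum(1.0 - lam / np.maximum(mag, 1e-12), 0.0)
    return scale * x

def prox_l21(x, lam, axis=0):
    """Proximal operator of lam * (sum of l2 norms along `axis`), i.e. group
    soft-thresholding.

    Assumption: `axis` indexes frequency, so all frequency bins of one
    time frame are shrunk together.
    """
    norm = np.sqrt(np.sum(np.abs(x) ** 2, axis=axis, keepdims=True))
    scale = np.maximum(1.0 - lam / np.maximum(norm, 1e-12), 0.0)
    return scale * x

# Swapping the source model then amounts to swapping this one assignment
# inside an otherwise unchanged iteration (hypothetical usage).
source_model_prox = prox_l21

spectrogram = np.array([[3.0, 0.0],
                        [4.0, 0.0]])   # rows: frequency, columns: time
shrunk = source_model_prox(spectrogram, lam=1.0)
```

The element-wise operator treats every time-frequency bin independently, whereas the grouped operator couples the bins along one axis, which is the kind of structural difference a source model is meant to encode.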

*Phase corrected total variation for audio signals*

Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 656 - 660, April 2018

Publication type: Research paper (international conference proceedings)

Abstract: In optimization-based signal processing, the so-called prior term models the desired signal, and therefore its design is the key factor in achieving good performance. For audio signals, the time-directional total variation applied to a spectrogram in combination with phase correction has recently been proposed to model the sinusoidal components of the signal. Although it is a promising prior, its applicability might be restricted to some extent because of a mismatch between its assumption and the signal. In this paper, building on the previously proposed prior, an improved prior for audio signals named instantaneous phase corrected total variation (iPCTV) is proposed. It can handle a wider range of audio signals owing to the instantaneous phase correction term calculated from the observed signal.

*Envelope estimation by tangentially constrained spline*

Kusano, Tsubasa; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 4374 - 4378, April 2018

Publication type: Research paper (international conference proceedings)

Abstract: Estimating the envelope of a signal has various applications, including empirical mode decomposition (EMD), in which envelope estimation based on the cubic C2-spline is generally used. While such a functional approach can easily control the smoothness of an estimated envelope, the so-called undershoot problem often occurs, which violates the basic requirement of an envelope. In this paper, a tangentially constrained spline with tangential points optimization is proposed for avoiding the undershoot problem while maintaining smoothness. It is defined as a quartic C2-spline function constrained by first derivatives at the tangential points, which effectively avoids undershoot. The tangential points optimization method is proposed in combination with this spline to attain optimal smoothness of the estimated envelope.

*Realizing directional sound source in FDTD method by estimating initial value*

Takeuchi, Daiki; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 461 - 465, April 2018

Publication type: Research paper (international conference proceedings)

Abstract: Wave-based acoustic simulation methods are actively studied for predicting acoustical phenomena. The finite-difference time-domain (FDTD) method is one of the most popular owing to the straightforwardness of calculating an impulse response. In an FDTD simulation, an omnidirectional sound source is usually adopted, which is not realistic because real sound sources often have specific directivities. However, there is very little research on imposing a directional sound source in FDTD methods. In this paper, a method of realizing a directional sound source in FDTD methods is proposed. It is formulated as an estimation problem for the initial value so that the estimated result corresponds to the desired directivity. The effectiveness of the proposed method is illustrated through numerical experiments.

*Modal decomposition of musical instrument sound via alternating direction method of multipliers*

Masuyama, Yoshiki; Kusano, Tsubasa; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 631 - 635, April 2018

Publication type: Research paper (international conference proceedings)

Abstract: For a musical instrument sound containing partials, or modes, the behavior of the modes around the attack time is particularly important. However, accurately decomposing the sound around the attack time is not an easy task, especially when the onset is sharp. This is because the spectra of the modes are sharply peaked, whereas sharp onsets require broad spectra. In this paper, an optimization-based method of modal decomposition is proposed to achieve accurate decomposition around the attack time. The proposed method is formulated as a constrained optimization problem to enforce the perfect reconstruction property, which is important for accurate decomposition. For optimization, the alternating direction method of multipliers (ADMM) is utilized, where the update of the variables is calculated in closed form. The proposed method achieves accurate modal decomposition for both simulated and real piano sounds.

*Parametric approximation of piano sound based on Kautz model with sparse linear prediction*

Kobayashi, Kenji; Takeuchi, Daiki; Iwamoto, Mio; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 626 - 630, April 2018

Publication type: Research paper (international conference proceedings)

Abstract: The piano is one of the most popular and attractive musical instruments, which has led to a large body of research on it. To synthesize piano sound on a computer, many modeling methods have been proposed, ranging from full physical models to approximate models. The focus of this paper is on the latter: approximating piano sound by an IIR filter. For stable parameter estimation, the Kautz model is chosen as the filter structure. The selection of the poles and the excitation signal then arises as a question typical of the Kautz model that must be resolved. In this paper, a sparsity-based construction of the Kautz model is proposed for approximating piano sound.

*Seeing the sound we hear: optical technologies for visualizing sound wave*

Oikawa, Yasuhiro; Ishikawa, Kenji; Yatabe, Kohei; Onuma, Takashi; Niwa, Hayato

Proc. SPIE 10666, Three-Dimensional Imaging, Visualization, and Display 2018, invited, vol. 10666, pp. 10666-1 - 10666-8, April 2018

Publication type: Research paper (international conference proceedings)

Abstract: Optical methods have been applied to visualize sound waves, and these have received considerable attention in both the optical and acoustical communities. We have researched optical methods for sound imaging, including laser Doppler vibrometry and the Schlieren method. More recently, parallel phase-shifting interferometry with a high-speed polarization camera has been used, and it can take slow-motion video of sound waves in the audible range. This presentation briefly reviews recent progress in optical imaging of sound in air and introduces applications including acoustic transducer testing and the investigation of acoustic phenomena.

*Simultaneous imaging of flow and sound using high-speed parallel phase-shifting interferometry*

Ishikawa, Kenji; Tanigawa, Risako; Yatabe, Kohei; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato

Optics Letters, peer-reviewed, 43(5), pp. 991 - 994, March 2018

Publication type: Research paper (academic journal); ISSN: 0146-9592

Abstract: © 2018 Optical Society of America. In this Letter, simultaneous imaging of flow and sound using parallel phase-shifting interferometry and a high-speed polarization camera is proposed. The proposed method enables the visualization of flow and sound simultaneously by using the following two factors: (i) injection of a gas whose density is different from that of the surrounding air makes the flow visible to interferometry, and (ii) time-directional processing is applied for extracting the small-amplitude sound wave from the high-speed flow video. An experiment with a frame rate of 42,000 frames per second for visualizing the flow and sound emitted from a whistle was conducted. By applying time-directional processing to the obtained video, both the flow emitted from the slit of the whistle and a spherical sound wave of 8.7 kHz were successfully captured.

*Infinite-dimensional SVD for revealing microphone array's characteristics*

Koyano, Yuji; Yatabe, Kohei; Oikawa, Yasuhiro

Applied Acoustics, peer-reviewed, vol. 129, pp. 116 - 125, January 2018

Publication type: Research paper (academic journal); ISSN: 0003-682X

Abstract: Nowadays, many acoustical applications utilize microphone arrays, whose configurations vary widely and include linear, planar, spherical, and random arrays. Arguably, some configurations are better than others in terms of acquiring the spatial information of a sound field (for example, a spherical array can distinguish any direction of arrival, while a linear array cannot distinguish the direction perpendicular to its aperture direction due to the rotational symmetry). However, it is not easy to compare arrays of different configurations because each array has been treated by a specific theory depending on its configuration. Although several criteria have been proposed for evaluating and/or designing arrays, most of them are application-oriented, and the best configuration for one criterion may not be the best for another. Therefore, an analysis method for microphone arrays that does not depend on the array configuration or application is necessary. In this paper, the infinite-dimensional SVD is proposed for analyzing and comparing the properties of arrays. The singular values, functions, and vectors obtained by the proposed method provide the fundamental properties of an array.

*Simultaneous visualization of flow and sound using parallel phase-shifting interferometry*

Tanigawa, Risako; Ishikawa, Kenji; Yatabe, Kohei; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato

11th Pacific Symposium on Flow Visualization and Image Processing (PSFVIP), pp. 031-1 - 031-4, December 2017

Publication type: Research paper (international conference proceedings)

Abstract: Aerodynamic sound generated by flow is one of the causes of noise; thus, its prediction and reduction are essential. Optical visualization techniques are effective for understanding the sound generation process caused by flow. In order to understand the generation process of aerodynamic sound, simultaneous visualization of flow and sound is necessary. However, simultaneous optical visualization of flow and sound has not been accomplished because transient sound field measurement has only recently been realized. This paper aims to visualize both flow and sound simultaneously. For this purpose, parallel phase-shifting interferometry (PPSI), which can measure a transient field, is suitable. As a basic experiment in simultaneous visualization using PPSI, the flow and sound field around a whistle were visualized.

*Hyper ellipse fitting in subspace method for phase-shifting interferometry: Practical implementation with automatic pixel selection*

Yatabe, Kohei; Ishikawa, Kenji; Oikawa, Yasuhiro

Optics Express, peer-reviewed, 25(23), pp. 29401 - 29416, November 2017

Publication type: Research paper (academic journal)

Abstract: © 2017 Optical Society of America. This paper presents a method of significantly improving the previously proposed simple, flexible, and accurate phase retrieval algorithm for random phase-shifting interferometry named HEFS [K. Yatabe, J. Opt. Soc. Am. A 34, 87 (2017)]. Although its effectiveness and performance were confirmed by numerical experiments in the original paper, it has been found that the algorithm may not work properly if the observed fringe images contain untrusted (extremely noisy) pixels. In this paper, a method of avoiding such untrusted pixels within the estimation processes of HEFS is proposed for the practical use of the algorithm. In addition to the proposal, an experiment measuring a sound field in air was conducted to show the performance for real data, a situation in which the proposed improvement is critical. MATLAB codes (which can be downloaded from http://goo.gl/upcsFe) are provided within the paper to aid understanding of the main concept of the proposed methods.

*Experimental visualization of flow-induced sound using high-speed polarization interferometer*

Ishikawa, Kenji; Tanigawa, Risako; Yatabe, Kohei; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato

14th International Conference on Flow Dynamics (ICFD), pp. 746 - 747, November 2017

Publication type: Research paper (international conference proceedings)

Abstract: An experimental method to visualize a sound field generated by flow is presented. The high-speed polarization interferometer is used to detect the change in refractive index caused by the sound wave. This paper demonstrates the visualization of acoustic resonance induced by air flow inside a rectangular cavity with a circular orifice. Two acoustic resonance modes, namely Helmholtz resonance and duct resonance, were successfully captured by the experiment. The proposed method should be effectively used to understand the physics of the interaction between flow and sound.

*Visualization of 3D sound field using see-through head mounted display*

Inoue, Atsuto; Yatabe, Kohei; Oikawa, Yasuhiro; Ikeda, Yusuke

Proceedings of SIGGRAPH ’17 Posters, peer-reviewed, pp. 34-1 - 34-2, July 2017

Publication type: Research paper (international conference proceedings)

Abstract: © 2017 Copyright held by the owner/author(s). We propose a system for visualizing three-dimensional (3D) sound information using video and optical see-through head mounted displays (ST-HMDs). The Mixed Reality (MR) displays enable intuitive understanding of the 3D information of a sound field, which is quite difficult to project onto an ordinary two-dimensional (2D) display in an easily understandable way. As examples of the visualization, the sound intensity (a stationary vector field representing the energy flow of sound) around a speaker and a motor engine is shown.

*Least-squares estimation of sound source directivity using convex selector of a better solution*

Tamura, Yuki; Yatabe, Kohei; Oikawa, Yasuhiro

Acoustical Science and Technology, peer-reviewed, 38(3), pp. 128 - 136, May 2017

Publication type: Research paper (academic journal); ISSN: 1346-3969

Abstract: Many acoustical simulation methods have been studied to investigate acoustical phenomena. Modeling of the directivity pattern of a sound source is also important for obtaining realistic simulation results. However, there has been little research on this. Although there has been research on sound source identification, the results might not be in a suitable form for numerical simulation. In this paper, a method for modeling a sound source from measured data is proposed. It utilizes the sum of monopoles as the physical model, and the modeling is achieved by estimating the model parameters. The estimation method is formulated as a convex optimization problem by assuming the smoothness of a solution and the sparseness of parameters. Moreover, an algorithm based on the alternating direction method of multipliers (ADMM) for solving the problem is derived. The validity of the method is evaluated using simulated data, and the modeling result for an actual loudspeaker is shown.

*Acousto-optic back-projection: Physical-model-based sound field reconstruction from optical projections*

Yatabe, Kohei; Ishikawa, Kenji; Oikawa, Yasuhiro

Journal of Sound and Vibration, peer-reviewed, vol. 394, pp. 171 - 184, April 2017

Publication type: Research paper (academic journal); ISSN: 0022-460X

Abstract: © 2017 Elsevier Ltd. As an alternative to microphones, optical techniques have been studied for measuring a sound field. They enable contactless and non-invasive acoustical observation by detecting the density variation of the medium caused by sound. Although they have important advantages compared with microphones, they also have some disadvantages. Since sound affects light at every point on the optical path, optical methods observe an acoustical quantity as a spatial integral. Therefore, point-wise information of a sound field cannot be obtained directly. Ordinarily, the computed tomography (CT) method has been applied for reconstructing a sound field from optically measured data. However, the observation process of the optical methods has not been considered explicitly, which limits the accuracy of the reconstruction. In this paper, a physical-model-based sound field reconstruction method is proposed. It explicitly formulates the physical observation process so that the model mismatch of the conventional methods is eliminated.

*Infinite-dimensional SVD for analyzing microphone array*

Koyano, Yuji; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 176 - 180, March 2017

Publication type: Research paper (international conference proceedings); ISSN: 1520-6149

Abstract: Nowadays, various types of microphone arrays are used in many applications. However, it is not easy to compare arrays of different types because each array has been treated by a specific theory depending on its type. Although several criteria have been proposed for evaluating and/or designing microphone arrays, most of them are application-oriented, and the best configuration for one criterion may not be the best for another. Therefore, a method for analyzing and comparing microphone arrays that does not depend on the array configuration or application is necessary. In this paper, the infinite-dimensional SVD is proposed for analyzing and comparing the properties of arrays. The singular values and functions obtained by the proposed method show the sampling property of an array and can serve as a unified criterion.

*Coherence-adjusted monopole dictionary and convex clustering for 3D localization of mixed near-field and far-field sources*

Tachikawa, Tomoya; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, pp. 3191 - 3195, March 2017

Publication type: Research paper (international conference proceedings); ISSN: 1520-6149

Abstract: In this paper, a 3D sound source localization method for simultaneously estimating both the direction of arrival (DOA) and the distance from the microphone array is proposed. For estimating distance, the off-grid problem must be overcome because the range of distances to be considered is quite broad and not even bounded. The proposed method estimates positions based on an extension of the convex clustering method combined with sparse coefficient estimation. A method for constructing a suitable monopole dictionary based on coherence is also proposed so that the convex-clustering-based method appropriately estimates the distance of sound sources. Numerical experiments on distance estimation and 3D localization show the potential of the proposed method.

*Spatio-temporal filter bank for visualizing audible sound field by Schlieren method*

Chitanont, Nachanant; Yatabe, Kohei; Ishikawa, Kenji; Oikawa, Yasuhiro

Applied Acoustics, peer-reviewed, vol. 115, pp. 109 - 120, January 2017

Publication type: Research paper (academic journal); ISSN: 0003-682X

Abstract: © 2016 Elsevier Ltd. Visualization of a sound field using optical techniques is a powerful tool for understanding acoustical behavior. It uses light waves to examine acoustical quantities without disturbing the field under investigation. Schlieren imaging is an optical method that uses a camera to visualize the density of transparent media. As it captures the information in a single shot without scanning, it can observe both reproducible and non-reproducible sound fields. Conventionally, the Schlieren system is applied to high-pressure ultrasound and shock waves. However, since the density variation of air caused by an audible sound field is very small, this method has not been applicable to visualizing such fields. In this paper, a spatio-temporal filter bank is proposed to overcome this problem. As sound is a very specific signal, the spatio-temporal spectrum (in two-dimensional space and time) of audible sound is concentrated in a specific region. The spatio-temporal filter bank is designed for extracting the sound field information in that region and removing noise. The results indicate that the visibility of the sound fields is enhanced by the proposed method.

*Simple, flexible, and accurate phase retrieval method for generalized phase-shifting interferometry*

Yatabe, Kohei; Ishikawa, Kenji; Oikawa, Yasuhiro

Journal of the Optical Society of America A: Optics and Image Science, and Vision, peer-reviewed, 34(1), pp. 87 - 96, January 2017

Publication type: Research paper (academic journal); ISSN: 1084-7529

Abstract: © 2016 Optical Society of America. This paper presents a non-iterative phase retrieval method for randomly phase-shifted fringe images. By combining the hyperaccurate least squares ellipse fitting method with the subspace method (usually called principal component analysis), a fast and accurate phase retrieval algorithm is realized. The proposed method is simple, flexible, and accurate. It is simple because it can be easily coded without iteration, an initial guess, or a tuning parameter. Its flexibility comes from the fact that totally random phase-shifting steps and any number of fringe images greater than two are acceptable without any specific treatment. Finally, it is accurate because the hyperaccurate least squares method and the modified subspace method enable phase retrieval with a small error, as shown by the simulations. MATLAB code, which is used in the experimental section, is provided within the paper to demonstrate its simplicity and ease of use.

*Interferometric imaging of acoustical phenomena using high-speed polarization camera and 4-step parallel phase-shifting technique*

Ishikawa, Kenji; Yatabe, Kohei; Ikeda, Yusuke; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato; Yoshii, Minoru

Proceedings of SPIE - The International Society for Optical Engineering, peer-reviewed, vol. 10328, January 2017

Publication type: Research paper (international conference proceedings); ISSN: 0277-786X

Abstract: © 2017 SPIE. Imaging of sound aids the understanding of acoustical phenomena such as propagation, reflection, and diffraction, which is strongly required for various acoustical applications. The imaging of sound is commonly done using a microphone array, whereas optical methods have recently attracted interest owing to their contactless nature. The optical measurement of sound utilizes the phase modulation of light caused by sound. Since light propagating through a sound field changes its phase in proportion to the sound pressure, optical phase measurement techniques can be used for sound measurement. Several methods, including laser Doppler vibrometry and the Schlieren method, have been proposed for that purpose. However, the sensitivities of these methods become lower as the frequency of the sound decreases. In contrast, since the sensitivity of the phase-shifting technique does not depend on the frequency of the sound, that technique is suitable for imaging sounds in the low-frequency range. The principle of imaging sound using parallel phase-shifting interferometry was reported by the authors (K. Ishikawa et al., Optics Express, 2016). The measurement system consists of a high-speed polarization camera made by Photron Ltd. and a polarization interferometer. This paper reviews the principle briefly and demonstrates the high-speed imaging of acoustical phenomena. The results suggest that the proposed system can be applied to various industrial problems in acoustical engineering.

*Signal processing for optical sound field measurement and visualization*

Yatabe, Kohei; Ishikawa, Kenji; Oikawa, Yasuhiro

Proceedings of Meetings on Acoustics, invited, 29(1), pp. 020010-1 - 020010-8, November 2016

Publication type: Research paper (international conference proceedings); ISSN: 1939-800X

Abstract: © 2017 Acoustical Society of America. Accurately measuring sound pressure is not an easy task because every microphone has its own mechanical and electrical characteristics. Moreover, the existence of a measuring instrument inside the field causes reflection and diffraction, which deform the wavefront of the sound to be measured. Ideally, a sensing device should neither have any characteristics of its own nor exist inside the measuring region. Although it may sound unrealistic, optical measurement methods are able to realize such an ideal situation. Optical devices can be placed outside the sound field, and some of the sensing techniques, which decode information about sound from the phase of light, are able to cancel optical and electrical characteristics. Thus, optical sound measurement methods have the potential, in principle, to achieve higher accuracy than ordinary sound measurement. However, they have two main drawbacks that have prevented their application in acoustics: (1) point-wise information cannot be obtained directly because the observed signal is spatially integrated along the optical path; and (2) increasing the signal-to-noise ratio is difficult because optical measurement on the sub-nanometer order is typically required. To overcome these difficulties, we have proposed several signal processing methods. In this paper, those methods are introduced together with the physical principle of optical sound measurement.

*Three-dimensional sound-field visualization system using head mounted display and stereo camera*

Inoue, Atsuto; Ikeda, Yusuke; Yatabe, Kohei; Oikawa, Yasuhiro

Proceedings of Meetings on Acoustics, 29(1), p. 025001-1 - 025001-13, November 2016

Publication type: Research paper (international conference proceedings); ISSN: 1939-800X

Abstract: © 2017 Acoustical Society of America. Visualization of a sound field helps us to intuitively understand various acoustic phenomena in sound design and education. The most straightforward method is to overlap the measured data onto a photographic image. However, in order to understand an entire three-dimensional (3D) sound field by using a conventional two-dimensional screen, it is necessary to move a camera and measure repeatedly. On the other hand, augmented reality (AR) techniques such as the video see-through head mounted display (VST-HMD) have been rapidly developed. In this study, we propose a sound field visualization system using a VST-HMD and a handheld four-point microphone. This system calculates sound intensity from the four sound signals in real time. Then, the sound intensity distribution is depicted as arrows in the 3D display. The position and angle of the microphones and the user's head are acquired via AR markers and head tracking sensors of the VST-HMD. The system realizes simple and effective visualization of 3D sound field information from various directions and positions of view. For the experiments, the sound fields generated by loudspeakers and motorcycles were visualized. The results suggested that the proposed system can present information of the field in an easily recognizable manner.

*Optical sensing of sound fields: non-contact, quantitative, and single-shot imaging of sound using high-speed polarization camera*

Ishikawa, Kenji; Yatabe, Kohei; Ikeda, Yusuke; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato; Yoshii, Minoru

Proceedings of Meetings on Acoustics, invited, 29(1), p. 030005-1 - 030005-8, November 2016

Publication type: Research paper (international conference proceedings); ISSN: 1939-800X

Abstract: © 2017 Acoustical Society of America. Imaging of a sound field aids understanding of the actual behavior of the field. That is useful for obtaining acoustical spatial characteristics of transducers, materials and noise sources. For high spatial resolution imaging, optical measurement methods have been used thanks to their contactless nature. This paper presents a sound field imaging method based on parallel phase-shifting interferometry, which enables acquisition of an instantaneous two-dimensional phase distribution of light. Information of the sound field is calculated from the phase of light based on the acousto-optic theory. The system consists of a polarization interferometer and a high-speed polarization camera. The number of measurement points in a single image is 512 × 512 and the interval between adjacent pixels is 0.22 mm. Therefore, the system can image a sound field with much higher spatial resolution compared with conventional imaging methods such as microphone arrays. The maximum frame rate, which corresponds to the sampling frequency, is 1.55 M frames per second. This paper contains the principle of optical measurement of sound, the description of the system, and several experimental results including imaging of sound fields generated by transducers and reflection of the sound waves.

*Sound source localization based on sparse estimation and convex clustering*

Tachikawa, Tomoya; Yatabe, Kohei; Ikeda, Yusuke; Oikawa, Yasuhiro

Proceedings of Meetings on Acoustics, 29(1), p. 055004-1 - 055004-14, November 2016

Publication type: Research paper (international conference proceedings); ISSN: 1939-800X

Abstract: Sound source localization techniques using microphones have been the subject of much interest for many years. Many of them assume far-field sources, and plane waves are used as a dictionary for estimating the direction-of-arrival (DOA) of sound sources. On the other hand, there has been less research on 3D source localization, which estimates both direction and distance. When estimating distances, monopoles must be used as a dictionary. By setting monopoles in the far field, their waves can be regarded as plane waves, and their distance can be estimated. However, setting monopoles at many positions can be infeasible due to high computational cost. Moreover, the grid discretization can cause estimation error because there are a large number of grid points in 3D space. Such a discretization issue is called the off-grid problem. Therefore, source localization with a monopole-only dictionary needs some method to solve the off-grid problem. The proposed method uses sparse estimation and modified convex clustering with a monopole-only dictionary. Sparse estimation selects the monopoles which are candidates for the source positions. Then, modified convex clustering solves the off-grid problem and estimates the source positions. In this paper, simulation and comparison with another method show the effectiveness of the proposed method.

*Improving principal component analysis based phase extraction method for phase-shifting interferometry by integrating spatial information*

Yatabe, Kohei; Ishikawa, Kenji; Oikawa, Yasuhiro

Optics Express, peer-reviewed, 24(20), p. 22881 - 22891, October 2016

Publication type: Research paper (academic journal)

Abstract: © 2016 Optical Society of America. Phase extraction methods based on the principal component analysis (PCA) can extract objective phase from phase-shifted fringes without any prior knowledge about their shift steps. Although the PCA method is fast and easy to implement, many fringe images are needed for extracting the phase accurately from noisy fringes. In this paper, a simple extension of the PCA method for reducing extraction error is proposed. It can effectively reduce the influence of random noise, while most of the advantages of the PCA method are inherited because it only modifies the construction process of the data matrix from fringes. Although it takes more time because the size of the data matrix to be decomposed is larger, the computational time of the proposed method is shown to be reasonably fast by using the iterative singular value decomposition algorithm. Numerical experiments confirmed that the proposed method can reduce extraction error even when the number of interferograms is small.

*Compensation of fringe distortion for phase-shifting three-dimensional shape measurement by inverse map estimation*

Yatabe, Kohei; Ishikawa, Kenji; Oikawa, Yasuhiro

Applied Optics, peer-reviewed, 55(22), p. 6017 - 6024, August 2016

Publication type: Research paper (academic journal); ISSN: 1559-128X

Abstract: © 2016 Optical Society of America. For three-dimensional shape measurement, phase-shifting techniques are widely used to recover the objective phase containing height information from images of projected fringes. Although such techniques can provide an accurate result in theory, there might be considerable error in practice. One main cause of such an error is distortion of fringes due to nonlinear responses of a measurement system. In this paper, a postprocessing method for compensating distortion is proposed. Compared to other compensation methods, the proposed method is flexible in two senses: (1) no specific model of nonlinearity (such as the gamma model) is needed, and (2) no special calibration data are needed (only the observed image of the fringe is required). Experiments using simulated and real data confirmed that the proposed method can compensate multiple types of nonlinearity without being concerned about the model.

*Optical sound field measurement and imaging using laser and high-speed camera*

Oikawa, Yasuhiro; Yatabe, Kohei; Ishikawa, Kenji; Ikeda, Yusuke

Proceedings of the INTER-NOISE 2016 - 45th International Congress and Exposition on Noise Control Engineering: Towards a Quieter Future, invited, p. 258 - 266, August 2016

Publication type: Research paper (international conference proceedings)

Abstract: © 2016, German Acoustical Society (DEGA). All rights reserved. Optical sound measurement, which acquires acoustical quantities by means of optical techniques, is of growing interest as an alternative method for sound field imaging. There are two remarkable aspects of optical sound measurement. The first is non-intrusiveness. Since the measurement is achieved by observing the light passed through the sound field, the instruments can be arranged outside the measurement field; non-contact and non-destructive measurement can be achieved. The second is spatial resolution. Instead of building an array, expanding or scanning of light is often used for optical imaging. Therefore, optical imaging does not have the limitation on the interval of measurement points that the size of the instruments imposes on microphone arrays. These two aspects make it possible for the optical method to image a sound field with high spatial resolution and without any disturbance to the original field. In this paper, we show several methods for optical sound imaging. A laser Doppler vibrometer can be developed into an imaging method by scanning a narrow light beam. Two-dimensional measurement of a transient field can be achieved by using a high-speed camera because of its single-shot capability. In addition, some signal processing techniques introduced to optical measurement are also described.

*Convex optimization-based windowed Fourier filtering with multiple windows for wrapped-phase denoising*

Yatabe, Kohei; Oikawa, Yasuhiro

Applied Optics, peer-reviewed, 55(17), p. 4632 - 4641, June 2016

Publication type: Research paper (academic journal); ISSN: 1559-128X

Abstract: © 2016 Optical Society of America. The windowed Fourier filtering (WFF), defined as a thresholding operation in the windowed Fourier transform (WFT) domain, is a successful method for denoising a phase map and analyzing a fringe pattern. However, it has some shortcomings, such as extremely high redundancy, which results in high computational cost, and difficulty in selecting an appropriate window size. In this paper, an extension of WFF for denoising a wrapped-phase map is proposed. It is formulated as a convex optimization problem using Gabor frames instead of WFT. Two Gabor frames with differently sized windows are used simultaneously so that the above-mentioned issues are resolved. In addition, a differential operator is combined with a Gabor frame in order to better preserve discontinuity of the underlying phase map. Some numerical experiments demonstrate that the proposed method is able to reconstruct a wrapped-phase map, even for a severely contaminated situation.

*High-speed imaging of sound using parallel phase-shifting interferometry*

Ishikawa, Kenji; Yatabe, Kohei; Chitanont, Nachanant; Ikeda, Yusuke; Oikawa, Yasuhiro; Onuma, Takashi; Niwa, Hayato; Yoshii, Minoru

Optics Express, peer-reviewed, 24(12), p. 12922 - 12932, June 2016

Publication type: Research paper (academic journal)

Abstract: © 2016 Optical Society of America. Sound-field imaging, the visualization of spatial and temporal distribution of acoustical properties such as sound pressure, is useful for understanding acoustical phenomena. This study investigated the use of parallel phase-shifting interferometry (PPSI) with a high-speed polarization camera for imaging a sound field, particularly high-speed imaging of propagating sound waves. The experimental results showed that the instantaneous sound field, which was generated by ultrasonic transducers driven by a pure tone of 40 kHz, was quantitatively imaged. Hence, PPSI can be used in acoustical applications requiring spatial information of sound pressure.

*Physical-model based efficient data representation for many-channel microphone array*

Koyano, Yuji; Yatabe, Kohei; Ikeda, Yusuke; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, 2016-May, p. 370 - 374, May 2016

Publication type: Research paper (international conference proceedings); ISSN: 1520-6149

Abstract: © 2016 IEEE. Recent development of microphone arrays which consist of several tens or hundreds of microphones enables acquisition of rich spatial information of sound. Although such information may improve the performance of any array signal processing technique, the amount of data will increase as the number of microphones increases; for instance, a 1024 ch MEMS microphone array, as in Fig. 1, generates more than 10 GB of data per minute. In this paper, a method for constructing an orthogonal basis for efficient representation of sound data obtained by the microphone array is proposed. The proposed method can obtain a basis for arrays with any configuration, including rectangular, spherical, and random microphone arrays. It can also be utilized for designing a microphone array because it offers a quantitative measure for comparing several array configurations.

*Eigenanalysis of lp-norm ball-shaped room using the method of particular solutions*

Yatabe, Kohei; Oikawa, Yasuhiro

12th Western Pacific Acoustics Conference (WESPAC 2015), p. 88 - 92, December 2015

Publication type: Research paper (international conference proceedings)

*Numerical analysis of acousto-optic effect caused by audible sound based on geometrical optics*

Ishikawa, Kenji; Yatabe, Kohei; Ikeda, Yusuke; Oikawa, Yasuhiro

12th Western Pacific Acoustics Conference (WESPAC 2015), p. 165 - 169, December 2015

Publication type: Research paper (international conference proceedings)

*Audible sound field visualization by using Schlieren technique*

Chitanont, Nachanant; Yatabe, Kohei; Oikawa, Yasuhiro

12th Western Pacific Acoustics Conference (WESPAC 2015), p. 5 - 9, December 2015

Publication type: Research paper (international conference proceedings)

*Extracting sound information from high-speed video using 3-D shape measurement method*

Yamanaka, Yusei; Yatabe, Kohei; Nakamura, Ayumi; Ikeda, Yusuke; Oikawa, Yasuhiro

12th Western Pacific Acoustics Conference (WESPAC 2015), p. 30 - 34, December 2015

Publication type: Research paper (international conference proceedings)

*Calculation of impulse response by using the method of fundamental solutions*

Suzuki, Naoko; Yatabe, Kohei; Oikawa, Yasuhiro

12th Western Pacific Acoustics Conference (WESPAC 2015), p. 388 - 392, December 2015

Publication type: Research paper (international conference proceedings)

*Modeling of free-reed instrument considering mechanical nonlinearity of the reed*

Nakamura, Ayumi; Yamanaka, Yusei; Yatabe, Kohei; Oikawa, Yasuhiro

12th Western Pacific Acoustics Conference (WESPAC 2015), p. 206 - 209, December 2015

Publication type: Research paper (international conference proceedings)

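*Sound field reconstruction based on sparse representation and its application to optical sound measurement data*

Yatabe, Kohei; Oikawa, Yasuhiro

The Journal of the Acoustical Society of Japan, invited, 71(11), p. 639 - 646, November 2015

Publication type: Research paper (academic journal)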

*Optically visualized sound field reconstruction using Kirchhoff-Helmholtz equation*

Yatabe, Kohei; Oikawa, Yasuhiro

Acoustical Science and Technology, peer-reviewed, 36(4), p. 351 - 354, January 2015

Publication type: Research paper (academic journal); ISSN: 1346-3969

Abstract: A method for reconstructing a measured sound field using the Kirchhoff-Helmholtz boundary integral equation is presented. An L4-norm ball shape was chosen for the boundary, which is smooth and more suitable than a circle for data measured along rectangular coordinates. ADMM (alternating direction method of multipliers) was employed to solve the initial value estimation via sparsity and the norm-ball-constrained least squares problem. The experiment using synthetic data confirmed the effectiveness of the method.

*Redundant representation of acoustic signals using curvelet transform and its application to speech denoising*

Chiba, Mariko; Yatabe, Kohei; Oikawa, Yasuhiro

Acoustical Science and Technology, peer-reviewed, 36(5), p. 457 - 458, January 2015

Publication type: Research paper (academic journal); ISSN: 1346-3969

Abstract: A study is conducted to propose a sparse representation method for acoustic signals by the use of curvelets, and to confirm its efficacy through an example of speech denoising. The curvelet denoising method with hard thresholding was applied to speech to demonstrate the effectiveness of the proposed representation. The method was applied to female speech with additive pink noise. The level of the noise was chosen so that the signal-to-noise ratio (SNR) became 10 dB. The thresholding process was iterated 10 times. The same input signal was processed by two denoising methods for comparison.

*Optically visualized sound field reconstruction based on sparse selection of point sound sources*

Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, 2015-August, p. 504 - 508, January 2015

Publication type: Research paper (international conference proceedings); ISSN: 1520-6149

Abstract: © 2015 IEEE. Visualization is an effective way to understand the behavior of a sound field. There are several methods for such observation, including the optical measurement technique, which enables non-destructive acoustical observation by detecting density variation of the medium. For audible sound propagating through the air, however, the smallness of the variation requires high sensitivity of the measuring system, which causes problematic noise contamination. In this paper, a method for reconstructing two-dimensional audible sound fields from noisy optical observation is proposed.

*Visualization of sound field by means of Schlieren method with spatio-temporal filtering*

Chitanont, Nachanant; Yaginuma, Keita; Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, 2015-August, p. 509 - 513, January 2015

Publication type: Research paper (international conference proceedings); ISSN: 1520-6149

Abstract: © 2015 IEEE. Visualization of a sound field using the Schlieren technique provides many advantages. It enables us to investigate the change of the sound field in real time at every point of the observed region. However, since the density gradient of air caused by the disturbance of the acoustic field is very small, it is difficult to observe the audible sound field from the raw Schlieren video. In this paper, to enhance visibility of the audible sound fields in the Schlieren videos, we propose to use spatio-temporal filters for extracting sound information and for noise removal. We utilized different filtering techniques, such as the FIR bandpass filter, the Gaussian filter, the Wiener filter and the 3D Gabor filter, to do this. The results indicate that the data observed after using these signal processing methods are clearer than the raw Schlieren videos.

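*Introduction of inverse analysis of the boundary element method to sound field measurement using a laser Doppler vibrometer*

Yatabe, Kohei; Oikawa, Yasuhiro

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (Japanese Edition), peer-reviewed, J97-A(2), p. 104 - 111, February 2014

Publication type: Research paper (academic journal); ISSN: 1881-0195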

*PDE-based interpolation method for optically visualized sound field*

Yatabe, Kohei; Oikawa, Yasuhiro

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, peer-reviewed, p. 4738 - 4742, January 2014

Publication type: Research paper (international conference proceedings); ISSN: 1520-6149

Abstract: An effective way to understand the behavior of a sound field is to visualize it. An optical measurement method is a suitable option for this as it enables contactless non-destructive measurement. After measuring a sound field, interpolation of the data is necessary for a smooth visualization. However, conventional interpolation methods cannot provide a physically meaningful result, especially when the condition of the measurement causes a moiré effect. In this paper, a special interpolation method for an optically visualized sound field based on the Kirchhoff-Helmholtz integral equation is proposed. © 2014 IEEE.

*Acoustic Yagi-Uda antenna using resonance tubes*

Tamura, Yuki; Yatabe, Kohei; Ouchi, Yasuhiro; Oikawa, Yasuhiro; Yamasaki, Yoshio

INTERNOISE 2014 - 43rd International Congress on Noise Control Engineering: Improving the World Through Noise Control, January 2014

Publication type: Research paper (international conference proceedings)

Abstract: A Yagi-Uda antenna gets high directivity by applying current phase shift between elements due to resonance phenomena. It has some directors and reflectors, which are elements without electric supply. The length of the directors is shorter than a half wavelength and that of the reflectors is longer than a half wavelength. We proposed an acoustic Yagi-Uda antenna whose elements are resonance tubes and a loudspeaker. The aim of this research is to improve directivity at a specific frequency. This can be applied to the Radio Acoustic Sounding System (RASS), which is a kind of radar for weather observation, or to a parametric loudspeaker. The phase shift of sound waves was observed by comparing the conditions with and without a resonance tube at the same position. That shift changes suddenly around the resonance frequency of the tube. Our acoustic antenna has resonance tubes of different lengths as directors and reflectors to apply this phenomenon. Moreover, the distances between a loudspeaker and the tubes were considered through experiments and numerical analysis. The acoustic antenna showed directivity under an appropriate condition of the distances and the frequency of the sound source. Consideration of the effective frequency band of this acoustic antenna will also be added.

*Wind noise reduction using empirical mode decomposition*

Yatabe, Kohei; Oikawa, Yasuhiro

Proceedings of Meetings on Acoustics, 19, June 2013

Publication type: Research paper (international conference proceedings); ISSN: 1939-800X

Abstract: One common problem of outdoor recordings is contamination by wind noise, which has highly non-stationary characteristics. Although there are a lot of noise reduction methods which work well for general kinds of noise, most methods perform worse for wind noise due to its non-stationary nature. Therefore, wind noise reduction needs a specific technique to overcome this non-stationarity. Empirical mode decomposition (EMD) is a relatively new method for decomposing a signal into several data-driven bases, which are modeled as amplitude- and frequency-modulated sinusoids. These bases represent wind noise better than quasi-stationary analysis methods such as the short-time Fourier transform because EMD assumes that the analyzed signal is non-stationary. Thus, EMD has the potential to reduce wind noise in recorded sounds in an entirely different way from ordinary methods. In this paper, a method for applying EMD as a wind noise suppressor is presented. The experiment was performed on female speech superimposed with wind noise, and the results showed its possibility. © 2013 Acoustical Society of America.

## Books and other publications

## Patents

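Reference number: 2021

*Stereo signal generation device, electronic musical instrument, stereo signal generation method, and program* (Japan)

Oikawa, Yasuhiro; Yatabe, Kohei; 大木 大夢; 小林 憲治; Takeuchi, Daiki

Japanese Patent Application No. 2018-42776; Japanese Unexamined Patent Application Publication No. 2019-161343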

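Reference number: 2022

*Resonance signal generation device, electronic musical instrument, and resonance signal generation method* (Japan)

Oikawa, Yasuhiro; Yatabe, Kohei; 小林 憲治

Japanese Patent Application No. 2018-42777; Japanese Unexamined Patent Application Publication No. 2019-158995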

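Reference number: 2023

*Mode decomposition device, mode decomposition method, and program* (Japan)

Oikawa, Yasuhiro; Yatabe, Kohei; Kusano, Tsubasa; Masuyama, Yoshiki

Japanese Patent Application No. 2018-43193; Japanese Unexamined Patent Application Publication No. 2019-159018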

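Reference number: 2140

*Estimation device, method therefor, and program* (Japan)

Yatabe, Kohei; Masuyama, Yoshiki

Japanese Patent Application No. 2019-14052; Japanese Unexamined Patent Application Publication No. 2020-122855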

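Reference number: 2141

*Time-frequency mask estimator learning device, time-frequency mask estimator learning method, and program* (Japan)

Yatabe, Kohei

Japanese Patent Application No. 2019-015065; Japanese Unexamined Patent Application Publication No. 2020-122896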

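Reference number: 2142

*Regression function learning device, regression function learning method, and program* (Japan)

Yatabe, Kohei; Takeuchi, Daiki

Japanese Patent Application No. 2019-015066; Japanese Unexamined Patent Application Publication No. 2020-122897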

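Reference number: 2229

*Phase estimation device, phase estimation method, and program* (Japan)

Yatabe, Kohei; Masuyama, Yoshiki

Japanese Patent Application No. 2019-135981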

## External research funding

### Grants-in-Aid for Scientific Research (KAKENHI) awarded

Research category:

*Research on data analysis methods for optical interferometric measurement of sound*

2017 - 2019

Amount allocated: ¥2,990,000

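Research category:

*Research on extracting acoustic information from high-speed video captured with an optical interferometer*

2019 - 2023

Amount allocated: ¥4,160,000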

Research category:

*Sensing of the space near sound sources and description of sound sources by optical acoustic measurement*

2020 - 2024

Amount allocated: ¥17,940,000

## Internal research programs

### Grants for Special Research Projects

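*Research on acoustic signal processing based on convex optimization*

Academic year 2017

Summary of research results: In signal processing, the problem to be solved is often reduced to an optimization problem, which is then optimized by some algorithm so that the desired processing is applied to the data. Optimization problems can be broadly divided into nonconvex and convex problems, and convex problems are better behaved than nonconvex ones; for example, global optimality can be guaranteed. In this research, various problems in acoustic signal processing were formulated as convex optimization problems and solved with convex optimization algorithms, which realized a variety of processing. Specifically, we proposed source separation, which estimates the original sources from a mixture; noise removal, which removes noise from an observed signal; mode decomposition, which splits a sound into harmonic components; estimation of initial conditions in acoustic simulation; and envelope estimation of real signals, among others.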

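*Acoustic signal processing by proximal splitting optimization*

Academic year 2018

Summary of research results: In signal processing, the task to be solved is often reduced to an optimization problem, which is then solved by some algorithm so that the desired processing is applied to the data. An optimization problem becomes harder to solve when it contains multiple terms, but recent proximal splitting methods only need to separate the terms and handle each one individually, which makes them suitable for complicated problems. In this research, various problems in acoustics were solved with proximal splitting algorithms, which realized a variety of acoustic signal processing. Specifically, we proposed source separation, which estimates the original sources from a mixture; noise removal, which removes noise from an observed signal; sound source localization, which estimates the position where a sound is emitted; and phase retrieval, which generates a phase from an amplitude spectrogram, among others.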

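*Acoustic signal processing based on heuristic extension of proximal splitting methods*

Academic year 2019

Summary of research results: In signal processing, the task to be solved is often reduced to an optimization problem, which is then solved by some algorithm so that the desired processing is applied to the data. An optimization problem becomes harder to solve when it contains multiple terms, but recent proximal splitting methods only need to separate the terms and handle each one individually, which makes them suitable for complicated problems. In this research, the optimization procedures used in previously proposed acoustic signal processing methods were extended heuristically, which realized new signal processing algorithms that are not bound by conventional formulations. Specifically, by generalizing the proximity operator of a proximal splitting algorithm for multichannel source separation, we made it possible to introduce operations that are difficult to formulate as proximity operators.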

## Courses currently taught

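Course | School / Graduate school | Academic year | Term |
---|---|---|---|
Introduction to Media Processing Technology | School of Fundamental Science and Engineering | 2020 | Spring semester |
Introduction to Media Processing Technology | School of Creative Science and Engineering | 2020 | Spring semester |
Introduction to Media Processing Technology | School of Advanced Science and Engineering | 2020 | Spring semester |
Science and Engineering Laboratory 1A | School of Fundamental Science and Engineering | 2020 | Spring semester |
Science and Engineering Laboratory 1A, Block V | School of Fundamental Science and Engineering | 2020 | Spring semester |
Science and Engineering Laboratory 1A | School of Creative Science and Engineering | 2020 | Spring semester |
Science and Engineering Laboratory 1A, Block V | School of Creative Science and Engineering | 2020 | Spring semester |
Science and Engineering Laboratory 1A | School of Advanced Science and Engineering | 2020 | Spring semester |
Science and Engineering Laboratory 1A, Block V | School of Advanced Science and Engineering | 2020 | Spring semester |
Basic Mathematics for Intermedia Art and Science | School of Fundamental Science and Engineering | 2020 | Spring semester |
Applied Acoustics | School of Fundamental Science and Engineering | 2020 | Fall semester |
Acoustic Systems | School of Fundamental Science and Engineering | 2020 | Fall semester |
Acoustic Systems | School of Fundamental Science and Engineering | 2020 | Fall semester |
Laboratory for Advanced Science and Engineering A | School of Advanced Science and Engineering | 2020 | Spring quarter |
Laboratory for Advanced Science and Engineering A | School of Advanced Science and Engineering | 2020 | Spring quarter |
Laboratory for Advanced Science and Engineering A | School of Advanced Science and Engineering | 2020 | Spring quarter |
Laboratory for Advanced Science and Engineering A [S Grade] | School of Advanced Science and Engineering | 2020 | Spring quarter |
Laboratory for Advanced Science and Engineering A [S Grade] | School of Advanced Science and Engineering | 2020 | Spring quarter |
Laboratory for Advanced Science and Engineering A [S Grade] | School of Advanced Science and Engineering | 2020 | Spring quarter |
Laboratory for Advanced Science and Engineering B | School of Advanced Science and Engineering | 2020 | Summer quarter |
Laboratory for Advanced Science and Engineering B | School of Advanced Science and Engineering | 2020 | Summer quarter |
Laboratory for Advanced Science and Engineering B | School of Advanced Science and Engineering | 2020 | Summer quarter |
Laboratory for Advanced Science and Engineering B [S Grade] | School of Advanced Science and Engineering | 2020 | Summer quarter |
Laboratory for Advanced Science and Engineering B [S Grade] | School of Advanced Science and Engineering | 2020 | Summer quarter |
Laboratory for Advanced Science and Engineering B [S Grade] | School of Advanced Science and Engineering | 2020 | Summer quarter |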