MOCHIZUKI, Yoshihiko

Official Title

Researcher(Associate Professor)

Affiliation

Faculty of Science and Engineering

(Waseda Research Institute for Science and Engineering)

Contact Information


Grants-in-Aid for Scientific Research Researcher Number

Educational background・Degree

Educational background

-2011 Chiba University Advanced Integration Science Information Sciences


Ph.D. in Engineering Coursework Chiba University Intelligent robotics

Research Field

Grants-in-Aid for Scientific Research classification

Informatics / Human informatics / Intelligent robotics


Defining a pairwise similarity measure based on linearity : Application to line extraction from distorted image

HINO Hideitsu;FUJIKI Jun;AKAHO Shotaro;MOCHIZUKI Yoshihiko;MURATA Noboru

Technical report of IEICE. PRMU 113(75) p.29 - 34 2013/06-2013/06




Outline:Data clustering is a fundamental technique in many fields of information processing, including image analysis. The results of data clustering depend on both the clustering algorithm and the similarity defined between pairs of data points. Focusing on the distribution of the data around the line connecting a pair of data points, we propose a method of defining a similarity between the pair. The similarity is then used for clustering the observed dataset. The method is applied to detecting line segments in distorted images taken by cameras with a fish-eye lens.

Detection by classification of buildings in multispectral satellite imagery

Ishii, Tomohiro; Simo-Serra, Edgar; Iizuka, Satoshi; Mochizuki, Yoshihiko; Sugimoto, Akihiro; Ishikawa, Hiroshi; Nakamura, Ryosuke

Proceedings - International Conference on Pattern Recognition p.3344 - 3349 2017/04-2017/04




Outline:© 2016 IEEE. We present an approach for the detection of buildings in multispectral satellite images. Unlike 3-channel RGB images, satellite imagery contains additional channels corresponding to different wavelengths. Approaches that do not use all channels are unable to fully exploit these images for optimal performance. Furthermore, care must be taken due to the large bias in classes, e.g., most of the Earth is covered in water and thus it will be dominant in the images. Our approach consists of training a Convolutional Neural Network (CNN) from scratch to classify multispectral image patches taken by satellites according to whether or not they belong to a class of buildings. We then adapt the classification network to detection by converting the fully-connected layers of the network to convolutional layers, which allows the network to process images of any resolution. The dataset bias is compensated for by subsampling negatives and tuning the detection threshold for optimal performance. We have constructed a new dataset using images from the Landsat 8 satellite for detecting solar power plants and show that our approach is able to significantly outperform the state-of-the-art. Furthermore, we provide an in-depth evaluation of the seven different spectral bands provided by the satellite images and show that it is critical to combine them to obtain good results.

Room reconstruction from a single spherical image by higher-order energy minimization

Fukano, Kosuke; Mochizuki, Yoshihiko; Iizuka, Satoshi; Simo-Serra, Edgar; Sugimoto, Akihiro; Ishikawa, Hiroshi

Proceedings - International Conference on Pattern Recognition p.1768 - 1773 2017/04-2017/04




Outline:© 2016 IEEE. We propose a method for understanding a room from a single spherical image, i.e., reconstructing and identifying the structural planes forming the ceiling, the floor, and the walls in a room. A spherical image records the light that falls onto a single viewpoint from all directions and does not require correlating geometrical information from multiple images, which facilitates robust and precise reconstruction of the room structure. In our method, we detect line segments from a given image and classify them into two groups: segments that form the boundaries of the structural planes and those that do not. We formulate this problem as a higher-order energy minimization problem that combines various measures of the likelihood that one, two, or three line segments are part of the boundary. We minimize the energy with graph cuts to identify the segments forming boundaries, from which we estimate the structural planes in 3D. Experimental results on synthetic and real images confirm the effectiveness of the proposed method.

Multiple-organ segmentation by graph cuts with supervoxel nodes

Takaoka, Toshiya; Mochizuki, Yoshihiko; Ishikawa, Hiroshi

Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017 p.424 - 427 2017/07-2017/07



Outline:© 2017 MVA Organization All Rights Reserved. Improvements in medical imaging technologies have made it possible for doctors to look directly into patients' bodies in ever finer detail. However, since only cross-sectional images can be seen directly, it is essential to segment the volume into organs so that their shapes can be viewed as 3D graphics of the organ boundary surfaces. Segmentation is also important for quantitative measurement in diagnosis. Here, we introduce a novel higher-precision method to segment multiple organs using graph cuts in medical images such as CT scans. We utilize supervoxels instead of voxels as the units of segmentation, i.e., the nodes in the graphical model, and design the energy function to be minimized accordingly. We utilize the SLIC supervoxel algorithm and verify the performance of our energy-minimization-based segmentation algorithm by comparing its results to the ground truth.

Unsupervised video object segmentation by supertrajectory labeling

Masuda, Masahiro; Mochizuki, Yoshihiko; Ishikawa, Hiroshi

Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017 p.448 - 451 2017/07-2017/07



Outline:© 2017 MVA Organization All Rights Reserved. We propose a novel approach to unsupervised video segmentation based on the trajectories of Temporal Super-pixels (TSPs). We cast the segmentation problem as a trajectory-labeling problem and define a Markov random field on a graph in which each node represents a trajectory of TSPs, which we minimize using a new two-stage optimization method we developed. The adoption of trajectories as basic building blocks brings several advantages over conventional superpixel-based methods, such as more expressive potential functions, temporal coherence of the resulting segmentation, and a drastically reduced number of MRF nodes. The most important effect, however, is that it allows more robust segmentation of a foreground that is static in some frames. The method is evaluated on a subset of the standard SegTrack benchmark and yields results competitive with the state-of-the-art methods.

Banknote portrait detection using convolutional neural network

Kitagawa, Ryutaro; Mochizuki, Yoshihiko; Iizuka, Satoshi; Simo-Serra, Edgar; Matsuki, Hiroshi; Natori, Naotake; Ishikawa, Hiroshi

Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017 p.440 - 443 2017/07-2017/07



Outline:© 2017 MVA Organization All Rights Reserved. Banknotes generally have different designs according to their denominations. Thus, if the characteristics of each design can be recognized, they can be used for sorting banknotes by denomination. The portrait on a banknote is one such characteristic that can be used for classification. A sorting system for banknotes can be designed that recognizes the portrait on each banknote and sorts it accordingly. In this paper, our aim is to automate the configuration of such a sorting system by automatically detecting portraits in sample banknotes, so that it can be quickly deployed in a new target country. We use Convolutional Neural Networks to detect portraits in a completely new set of banknotes in a way that is robust to variation in how they are shown, such as the size and the orientation of the face.

Research Grants & Projects

Grants-in-Aid for Scientific Research Adoption Situation

Research Classification:

Energy design for high dimensional image processing based on higher order energy minimization


Allocation Class:¥3640000

Research Classification:

Scale-space analysis for point sets on the sphere and applications to computer vision


Allocation Class:¥2730000

On-campus Research System

Special Research Project



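Research Results Outline:The aim of this research is to automatically determine the energy definition that is needed when formulating an energy minimization problem to be solved by graph cuts. So-called higher-order energies are generally said to give better estimates, but there is no clear guideline for what kind of energy should actually be constructed. A measure of the quality of an energy, one that is useful in energy design, therefore needs to be developed. In this project, we took segmentation methods based on higher-order energies as the main subject of study and examined a variety of energy constructions: terms conditioned on the CT values of two voxels, terms conditioned on position, terms defined over three or more voxels, and so on. Energies defined in this way require the voxel configuration and other factors to be varied in many ways, so the number of combinations is enormous. In practice, it is therefore necessary to identify which of them are important. To investigate this theoretically, we examined several information measures. We developed a method that treats the label taken by each voxel as a random variable and computes the mutual information between them, which serves as an indicator of how the various conditions affect the result and which conditions should be adopted. At this stage the study is theoretical; experiments that actually adopt these terms as the energy and evaluate the segmentation results remain to be done. There is also still room for examination of how to make the energy higher-order, so we will continue to investigate this. In addition, each term of the energy is usually computed from a probability distribution, so the energy is defined as their simple sum, but when some terms cannot be expressed by a probability distribution, weights must be introduced to balance the terms. To set such hyperparameters, we developed a method that evaluates the quality of an estimation result from the result itself. It compares estimation results with the ground truth and is computed from those quantities that correlate highly with statistical measures. This makes it possible to select the best among multiple segmentation results. Because this method has not yet been fully examined from the viewpoint of probability distributions, we intend to clarify the interpretation of the weights in relation to the probability distributions.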

Lecture Course

Course Title / School / Year / Term
Computer Science and Engineering Laboratory / School of Fundamental Science and Engineering / 2018 / fall semester
Computer Science and Engineering Laboratory / School of Fundamental Science and Engineering / 2018 / fall semester