IEEE Transactions on Artificial Intelligence,
in press.
Abstract
Recent developments in measurement tools have made it easier to obtain shape data, a collection of point coordinates in vector space that are meaningful when some of them are gathered together.
As a result, the clustering of shape data has become increasingly important.
However, few studies have achieved clustering that is applicable in a wide variety of cases, because many rely on their own specific shape representations.
Thus, we apply a simple and widely recognized representation and generative model to shape data.
A configuration matrix of the point coordinates is used for the representation,
and it is the simplest and most well-accepted representation in conventional shape analysis.
As a generative model, we consider the mixture density function, a well-known model in statistics for expressing a population density function,
which is a linear combination of subpopulation density functions.
The aim of this paper is to present a mixture density-based model that will be useful for clustering shape data.
The clustering of shapes involves estimating the parameters of the model, and this estimation is derived using an EM algorithm based on the model.
As examples of promising shape-data applications, the computational analyses of ape skulls, American football formations, and baseball pitches were performed.
In addition, we evaluated the performance of the EM algorithm by comparing it with other typical clustering methods.
The theoretical results not only contribute to statistical estimation for shape data but also extend clustering to non-vector shape data.
The experimental results show that the derived EM algorithm performs well in shape clustering.
Keywords
clustering, expectation-maximization algorithm, maximum likelihood estimation, mixture density function
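As a point of reference for the mixture-density idea above, the following is a minimal EM sketch for a spherical Gaussian mixture on vectorized data. It is not the paper's shape-specific model, which operates on configuration matrices; the function name `em_gmm` and the deterministic initialization are assumptions made purely for illustration.

```python
import numpy as np

def em_gmm(X, K, n_iter=50):
    """EM for a spherical Gaussian mixture (an illustrative sketch only).

    X: (n, d) data matrix; K: number of clusters.
    Returns mixing weights, means, and responsibilities.
    """
    n, d = X.shape
    # Simple deterministic initialization: evenly strided data points.
    means = X[:: max(1, n // K)][:K].copy()
    weights = np.full(K, 1.0 / K)
    var = X.var() + 1e-6                         # shared spherical variance
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] proportional to w_k N(x_i | mu_k, var I).
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        logp = np.log(weights) - 0.5 * sq / var
        logp -= logp.max(axis=1, keepdims=True)  # subtract max for stability
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and the shared variance.
        nk = r.sum(axis=0)
        weights = nk / n
        means = (r.T @ X) / nk[:, None]
        sq_new = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = (r * sq_new).sum() / (n * d) + 1e-6
    return weights, means, r

# Two well-separated 2-D blobs; clustering should recover the grouping.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
w, mu, r = em_gmm(X, K=2)
labels = r.argmax(axis=1)
```

The responsibilities play the role of soft cluster assignments; replacing the Euclidean squared distance with a shape distance is the kind of extension the paper develops.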
Journal of Computational Science,
Vol. 85, 102494, Feb. 2025.
Abstract
We consider movements between two chosen points in a bounded set in the plane.
The length of the movement between two points is measured by the Euclidean metric.
We can then obtain the distribution of distances over the bounded set by repeatedly measuring the lengths of the movements.
When we select two points uniformly in a bounded set,
the shape of the bounded set affects the distribution.
The distribution is easily computable by integral approximation and usefully remains invariant under the group of congruence transformations on the bounded set.
For these reasons, some fundamental methods based on this distribution have been used in shape analysis.
Thus, this distribution is important to some practical methods based on the Euclidean metric.
Although some articles have been devoted to the study of non-Euclidean metrics in pattern recognition including shape analysis,
only a few attempts have so far been made using the distribution of such non-Euclidean metrics.
In this paper, we present the expression for the distributions with several non-Euclidean metrics using a density for the set of lines.
This expression can be efficiently calculated as a sum of distributions, by generating a finite set of lines.
We concentrate on a bounded convex set in the plane to define a non-Euclidean metric.
In examples of such metrics on the convex set, we discuss those well-known in taxicab and projective geometries.
In the experimental results using several convex sets, we visualize the distributions with the non-Euclidean metrics by plotting their graphs.
Comparing the distribution based on the Euclidean metric with those based on the non-Euclidean metrics,
we reveal several differences among the distributions.
Keywords
integral approximation, non-Euclidean metric, distribution of distances, plane bounded convex set, Hough transform
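The Euclidean baseline against which the non-Euclidean distributions are compared can be approximated by simple Monte Carlo sampling rather than the paper's line-density expression. A minimal sketch on the unit square (a hypothetical choice of bounded convex set):

```python
import math
import random

def distance_sample(n_pairs, seed=0):
    """Sample Euclidean distances between uniformly chosen point pairs
    in the unit square, approximating the distance distribution."""
    rng = random.Random(seed)
    ds = []
    for _ in range(n_pairs):
        x1, y1, x2, y2 = (rng.random() for _ in range(4))
        ds.append(math.hypot(x2 - x1, y2 - y1))
    return ds

ds = distance_sample(100_000)
mean_d = sum(ds) / len(ds)
# For the unit square, the exact mean distance is
# (2 + sqrt(2) + 5*ln(1 + sqrt(2))) / 15, approximately 0.5214.
```

A histogram of `ds` visualizes the distribution; changing the sampling region shows the shape-dependence noted in the abstract.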
Kazunori Iwata, Hiroki Yamamoto, and Kazushi Mimura
IEICE Transactions on Information and Systems,
Vol. E104-D, No. 2, pp. 285-293, Feb. 2021.
Abstract
Shape matching with local descriptors is an underlying scheme in shape
analysis.
We can visually confirm the matching results and also
assess them for shape classification.
Generally, shape matching is implemented by determining the
correspondence between shapes that are represented by their
respective sets of sampled points.
Some matching methods have already been proposed; the main difference
between them lies in their choice of matching cost function.
This function measures the dissimilarity between the local
distribution of sampled points around a focusing point of one shape
and the local distribution of sampled points around a referring point
of another shape.
A local descriptor is used to describe the distribution of sampled
points around the point of the shape.
In this paper, we propose an extended scheme for shape matching
that can compensate for errors in existing local descriptors.
Our scheme is convenient for existing local descriptors because it does not
require them to be modified.
The main idea of our scheme is to consider the correspondence of
neighboring sampled points to a focusing point when determining the
correspondence of the focusing point.
This is useful because it increases the chance of finding a
suitable correspondence.
However, considering the correspondence of neighboring points
causes a problem regarding computational feasibility, because
there is a substantial increase in the number of possible
correspondences that need to be considered in shape matching.
We solve this problem using a branch-and-bound algorithm for efficient
approximation.
Using several shape datasets, we demonstrate that our scheme yields a
more suitable matching than the conventional scheme, which does not consider
the correspondence of neighboring sampled points, while requiring only a
small increase in execution time.
Keywords
shape matching, local shape descriptor, probability density estimator,
branch-and-bound algorithm
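The matching-cost idea above can be illustrated with a deliberately crude local descriptor. The sketch below is not the paper's method: it describes each sampled point by a normalized histogram of directions to the other points, compares histograms with a chi-square matching cost, and assigns correspondences greedily. The names `angle_histogram` and `match_greedy` are hypothetical; the paper's branch-and-bound search over neighboring correspondences would replace the greedy step.

```python
import numpy as np

def angle_histogram(points, i, n_bins=8):
    """Histogram of directions from point i to the other sampled points
    (a crude stand-in for a local shape descriptor)."""
    diffs = np.delete(points, i, axis=0) - points[i]
    ang = np.arctan2(diffs[:, 1], diffs[:, 0])            # angles in (-pi, pi]
    hist, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi))
    return hist / hist.sum()

def chi2_cost(h1, h2, eps=1e-12):
    """Chi-square matching cost between two normalized histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def match_greedy(A, B):
    """Greedy correspondence: each point of A picks its cheapest point of B."""
    HA = np.array([angle_histogram(A, i) for i in range(len(A))])
    HB = np.array([angle_histogram(B, j) for j in range(len(B))])
    cost = np.array([[chi2_cost(a, b) for b in HB] for a in HA])
    return cost.argmin(axis=1), cost

# Matching a square against an identical copy should give the identity map.
A = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
B = A.copy()
corr, cost = match_greedy(A, B)
```

A real application would compare two different shapes; the identical copy merely makes the expected correspondence obvious.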
IEICE Transactions on Information and Systems,
Vol. E103-D, No. 12, pp. 2649-2658, Dec. 2020.
Abstract
The nearest neighbor method is a simple and flexible scheme for the
classification of data points in a vector space.
It predicts a class label of an unseen data point using a majority rule
for the labels of known data points inside a neighborhood of the
unseen data point.
Because it sometimes achieves good performance even for
complicated problems, several derivatives of it have been studied.
Among them, the discriminant adaptive nearest neighbor method is
particularly worth revisiting to demonstrate its application.
The main idea of this method is to adjust the neighbor metric of an
unseen data point to the set of known data points before label prediction.
It often improves the prediction, provided the neighbor metric is
adjusted well.
For statistical shape analysis, shape classification attracts
attention because it is a vital topic in shape analysis.
However, because a shape is generally expressed as a matrix, it is
non-trivial to apply the discriminant adaptive nearest neighbor method
to shape classification.
Thus, in this study, we extend the discriminant adaptive nearest
neighbor method to make it more useful in shape classification.
To achieve this development, a mixture model and optimization
algorithm for shape clustering are incorporated into the method.
Furthermore, we describe several helpful techniques for the initial
guess of the model parameters in the optimization algorithm.
Using several shape datasets, we demonstrate that our method is
successful for shape classification.
Keywords
shape classification, ordinary Procrustes sum of squares, nearest neighbor method, discriminant adaptive nearest neighbor method
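For orientation, the base method the paper builds on is the classic nearest neighbor rule with a majority vote. A minimal sketch follows; the discriminant adaptive variant would additionally reshape the Euclidean neighbor metric around the query point, which is not shown here, and `knn_predict` is an illustrative name.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classic k-nearest-neighbor prediction with a majority rule."""
    d = np.linalg.norm(X_train - x, axis=1)   # Euclidean neighbor metric
    nearest = np.argsort(d)[:k]               # indices of the k closest points
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

# Two tight clusters with known labels; queries near each cluster
# should inherit that cluster's label.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(X, y, np.array([0.05, 0.05]))
```

Because a shape is a matrix rather than a vector, the distance computation in the first line is exactly what must be replaced (e.g., by a Procrustes-type distance) to apply this rule to shapes.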
IEICE Transactions on Information and Systems,
Vol. E102-D, No. 12, pp. 2547-2556, Dec. 2019.
Abstract
Consider selecting points on a contour in the x-y plane.
In shape analysis, this is frequently referred to as contour sampling.
It is important to select the points such that they effectively represent the shape of the contour.
Generally, the stroke order and number of strokes are informative for that purpose.
Several effective methods exist for sampling contours drawn with a certain stroke order and number of strokes, such as the English alphabet or Arabic figures.
However, many contours entail an uncertain stroke order and number of strokes, such as pictures of symbols, and little research has focused on methods for sampling such contours.
This is because selecting the points in this case typically requires a
large computational cost to check all the possible choices.
In this paper, we present a sampling method that is useful regardless of whether the contours are drawn with a certain stroke order and number of strokes or not.
Our sampling method thereby expands the application possibilities of contour processing.
We formulate contour sampling as a discrete optimization problem that can be solved using a type of direct search.
Based on a geometric graph whose vertices are the points and whose edges form rectangles, we construct an effective objective function for the problem.
Using different shape datasets, we demonstrate that our sampling method is effective with respect to shape representation and retrieval.
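A common baseline against which such sampling methods are judged is uniform arc-length resampling of a polyline contour. The sketch below shows that baseline only; the paper's method instead selects points by discrete optimization over a geometric graph.

```python
import numpy as np

def resample_contour(points, n_samples):
    """Uniform arc-length resampling of an open polyline contour.

    A common baseline for contour sampling; for a closed contour,
    append the first point to the end before calling.
    """
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)  # segment lengths
    t = np.concatenate([[0.0], np.cumsum(seg)])            # cumulative arc length
    targets = np.linspace(0.0, t[-1], n_samples)           # equally spaced lengths
    xs = np.interp(targets, t, points[:, 0])
    ys = np.interp(targets, t, points[:, 1])
    return np.column_stack([xs, ys])

# An L-shaped contour of total length 2, resampled into 5 points.
contour = [[0, 0], [1, 0], [1, 1]]
samples = resample_contour(contour, 5)
```

Uniform arc-length sampling needs a known traversal order, which is precisely what is unavailable for contours with an uncertain stroke order; that gap motivates the optimization-based method above.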
IEEE Transactions on Neural Networks and Learning Systems,
Vol. 28, No. 8, pp. 1865-1877, Aug. 2017.
Abstract
Softmax selection is one of the most popular methods for action selection in reinforcement learning.
Although various recently proposed methods may be more effective with
full parameter tuning, implementing a complicated method that
requires the tuning of many parameters can be difficult.
Thus, softmax selection is still worth revisiting, considering the cost savings of its implementation and tuning.
In fact, this method works adequately in practice with only one parameter appropriately set for the environment.
The aim of this paper is to improve the variable setting of this method to extend the bandwidth of good parameters, thereby reducing the cost of implementation and parameter tuning.
To achieve this, we take advantage of the asymptotic equipartition property in a Markov decision process to extend the peak bandwidth of softmax selection.
Using a variety of episodic tasks, we show that our setting is effective in extending the bandwidth and that it yields a better policy in terms of stability.
The bandwidth is quantitatively assessed in a series of statistical tests.
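Softmax (Boltzmann) selection itself is standard; a minimal sketch with a single temperature parameter follows. The paper's contribution concerns how the parameter setting is derived from the asymptotic equipartition property, which is not reproduced here.

```python
import math
import random

def softmax_select(q_values, temperature, rng=random):
    """Softmax (Boltzmann) action selection: the probability of an action
    grows exponentially with its value estimate, scaled by a temperature."""
    m = max(q_values)                          # subtract max for stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    probs = [e / total for e in exps]
    r, acc = rng.random(), 0.0
    for a, p in enumerate(probs):
        acc += p
        if r < acc:
            return a, probs
    return len(probs) - 1, probs

# With a low temperature, the highest-valued action dominates.
rng = random.Random(0)
q = [1.0, 2.0, 0.5]
counts = [0, 0, 0]
for _ in range(10_000):
    a, probs = softmax_select(q, temperature=0.5, rng=rng)
    counts[a] += 1
```

Raising the temperature flattens `probs` toward uniform exploration; the "bandwidth of good parameters" in the abstract refers to the range of temperature settings that still yield good policies.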
Kosei Kurisu, Nobuo Suematsu, Kazunori Iwata, and Akira Hayashi
IEICE Transactions on Information and Systems,
Vol. E98-D, No. 4, pp. 930-937, Apr. 2015.
Abstract
In image segmentation, finite mixture modeling has been widely used.
In its simplest form, the spatial correlation among neighboring pixels
is not taken into account, and segmentation results can be severely
degraded by image noise.
We propose a spatially correlated mixture model in which the mixing
proportions of finite mixture models are governed by a set of
underlying functions defined on the image space.
The spatial correlation among pixels is introduced by putting a
Gaussian process prior on the underlying functions.
We can set the spatial correlation rather directly and flexibly by
choosing the covariance function of the Gaussian process prior.
The effectiveness of our model is demonstrated by experiments with
synthetic and real images.
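The role of the covariance function can be illustrated with a squared-exponential (RBF) kernel on pixel coordinates, a common choice for Gaussian process priors; the kernel, its parameters, and the name `rbf_cov` are illustrative assumptions, not necessarily those used in the paper.

```python
import numpy as np

def rbf_cov(coords, length_scale=2.0, variance=1.0):
    """Squared-exponential covariance on pixel coordinates:
    nearby pixels receive strongly correlated function values."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

# A 1-D "image" of 10 pixel positions.
coords = np.arange(10, dtype=float)[:, None]
K = rbf_cov(coords, length_scale=2.0)
# Sample one underlying function from the GP prior N(0, K);
# a small jitter keeps the covariance numerically positive definite.
rng = np.random.default_rng(0)
f = rng.multivariate_normal(np.zeros(10), K + 1e-8 * np.eye(10))
```

Increasing `length_scale` makes sampled functions, and hence the mixing proportions they govern, smoother across the image: this is the "rather direct and flexible" control over spatial correlation mentioned above.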
Pattern Recognition,
Vol. 46, No. 12, pp. 3452-3459, Dec. 2013.
Abstract
The generalized Viterbi algorithm, a direct extension of the Viterbi
algorithm for hidden Markov models (HMMs), has been used to find the
most likely state sequence for hierarchical HMMs.
However, the generalized Viterbi algorithm finds the most likely whole level
state sequence rather than the most likely upper level state
sequence.
In this paper, we propose a marginalized Viterbi algorithm, which
finds the most likely upper level state sequence by marginalizing
lower level state sequences.
We show experimentally that the marginalized Viterbi algorithm is
more accurate than the generalized Viterbi algorithm in terms of
upper level state sequence estimation.
Keywords
time series data, hierarchical HMM, finding the most likely state
sequence, generalized Viterbi algorithm, marginalized Viterbi algorithm
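For contrast with the marginalized variant, the standard Viterbi algorithm for a flat HMM can be sketched as log-domain dynamic programming with backpointers:

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Viterbi algorithm: most likely hidden state sequence of an HMM.

    pi: (S,) initial probabilities; A: (S, S) transition matrix;
    B: (S, V) emission matrix; obs: sequence of observation indices.
    """
    S, T = len(pi), len(obs)
    delta = np.zeros((T, S))           # best log-prob of paths ending in each state
    psi = np.zeros((T, S), dtype=int)  # backpointers
    logA, logB = np.log(A), np.log(B)
    delta[0] = np.log(pi) + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA   # (prev state, current state)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    # Backtrack the best path from the final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Two sticky states that mostly emit symbol 0 or symbol 1, respectively.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.9, 0.1], [0.1, 0.9]])
path = viterbi(pi, A, B, [0, 0, 1, 1, 1])
```

The generalized and marginalized variants discussed in the paper extend this maximization over the state hierarchy of a hierarchical HMM, with the marginalized version summing out the lower-level sequences instead of maximizing over them.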
Katsutoshi Ueaoki, Kazunori Iwata, Nobuo Suematsu, and Akira Hayashi
IEICE Transactions on Information and Systems,
Vol. E94-D, No. 12, pp. 2487-2494, Dec. 2011.
Abstract
A two-dimensional shape is generally represented with line drawings or
object contours in a digital image.
Shapes can be divided into two types, namely ordered and unordered
shapes.
An ordered shape is an ordered set of points, while an unordered shape
is an unordered set.
As a result, each type typically uses different attributes to define the
local descriptors involved in representing the local distributions of
points sampled from the shape.
Throughout this paper, we focus on unordered shapes.
Since most local descriptors of unordered shapes are not
scale-invariant, we usually make the shapes in an image data set the
same size through scale normalization, before applying shape matching
procedures.
Shapes obtained through scale normalization are suitable for such
descriptors if the original whole shapes are similar.
However, they are not suitable if parts of each original shape are drawn
using different scales.
Thus, in this paper, we present a scale-invariant descriptor
constructed from von Mises distributions to deal with such shapes.
Since this descriptor has the merits of being both scale-invariant and a
probability distribution, it does not require scale normalization and
can employ an arbitrary measure of probability distributions in
matching shape points.
In experiments on shape matching and retrieval, we show the
effectiveness of our descriptor, compared to several conventional
descriptors.
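The building block of such a descriptor, the von Mises density on angles, can be evaluated directly. The sketch below shows only the density itself, not the full descriptor construction; since it depends on angles alone, it is unaffected by scaling.

```python
import numpy as np

def von_mises_pdf(theta, mu, kappa):
    """Density of the von Mises distribution on angles:
    f(theta) = exp(kappa * cos(theta - mu)) / (2 * pi * I0(kappa)),
    where I0 is the modified Bessel function of order zero (np.i0).
    mu is the mean direction; kappa controls the concentration."""
    return np.exp(kappa * np.cos(theta - mu)) / (2 * np.pi * np.i0(kappa))

thetas = np.linspace(-np.pi, np.pi, 1000, endpoint=False)
pdf = von_mises_pdf(thetas, mu=0.0, kappa=2.0)
# The density integrates to one over the circle and peaks at theta = mu.
area = pdf.sum() * (2 * np.pi / 1000)
```

Because the descriptor is a genuine probability distribution, any dissimilarity between distributions (e.g., a chi-square or Kullback-Leibler measure) can be used when matching shape points, as the abstract notes.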
Neural Networks, Vol. 24, No. 10, pp. 1074-1081, Dec. 2011.
Abstract
We present a general analysis of return maximization in reinforcement
learning.
This analysis does not require assumptions of Markovianity,
stationarity, and ergodicity for the stochastic sequential decision
processes of reinforcement learning.
Instead, our analysis assumes the asymptotic equipartition property
fundamental to information theory, providing a substantially
different view from that in the literature.
As our main results, we show that return maximization is achieved by
the overlap of typical and best sequence sets, and we present a class
of stochastic sequential decision processes with the necessary
condition for return maximization.
We also describe several examples of best sequences in terms of
return maximization in the class of stochastic sequential decision
processes, which satisfy the necessary condition.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 30, No. 1, pp. 76-88, Jan. 2008.
Abstract
We introduce a novel dissimilarity measure into a probabilistic clustering task
to properly measure the dissimilarity among multiple clusters when each
cluster is characterized by a subpopulation in a mixture model.
This measure of dissimilarity is called redundancy-based dissimilarity
among probability distributions.
From aspects of both source coding and a statistical hypothesis test,
we shed light on several of the theoretical reasons for the
redundancy-based dissimilarity among probability distributions being a
reasonable measure of dissimilarity among clusters.
We also elucidate a principle in common for the measures of
redundancy-based dissimilarity and Ward's method in terms of
hierarchical clustering criteria.
Moreover, we show several related theorems that are significant for
clustering tasks.
In the experiments, properties of the measure of redundancy-based
dissimilarity are examined in comparison with several other measures.
Keywords
clustering, mixture model, dissimilarity measure, information theory,
Ward's method
IEEE Transactions on Neural Networks, Vol. 17, No. 4, pp. 829-842, July 2006.
Abstract
We exhibit an important property called the asymptotic equipartition
property on empirical sequences in an ergodic multi-agent Markov
decision process.
Using the asymptotic equipartition property, which facilitates the
analysis of multi-agent learning, we give a statistical property of
multi-agent learning, such as reinforcement learning, near the end of
the learning process.
We examine the effect of the conditions among the agents on the
achievement of a cooperative policy in three different cases: blind,
visible, and communicable.
Also, we derive a bound on the speed with which the empirical sequence
converges to the best sequence in probability, so that the multi-agent
learning yields the best cooperative result.
Neural Networks, Vol. 19, No. 1, pp. 62-75, Jan. 2006.
Abstract
We discuss an important property called the asymptotic equipartition
property on empirical sequences in reinforcement learning.
This states that the typical set of empirical sequences has
probability nearly one, that all elements in the typical set are
nearly equi-probable, and that the number of elements in the typical
set is an exponential function of the sum of conditional entropies if
the number of time steps is sufficiently large.
The sum is referred to as stochastic complexity.
Using the property, we elucidate the fact that return maximization
depends on two factors: the stochastic complexity and a
quantity depending on the parameters of the environment.
Here, the return maximization means that the best sequences in terms
of expected return have probability one.
We also examine the sensitivity of the stochastic complexity, which is
a qualitative guide for tuning the parameters of the action-selection
strategy, and show a sufficient condition for return maximization in
probability.
IEICE Transactions on Information and Systems, Vol. E88-D,
No. 1, pp. 135-142, Jan. 2005.
Abstract
We regard the events of a Markov decision process as the outputs from a
Markov information source in order to analyze the randomness of
an empirical sequence by the codeword length of the sequence.
Randomness is an important viewpoint in reinforcement learning,
since learning aims to eliminate randomness and find an optimal policy.
The occurrence of an optimal empirical sequence also depends on the
randomness.
We then introduce Lempel-Ziv coding to measure the randomness,
which consists of two components: the domain size and the stochastic
complexity.
In the experimental results, we confirm that the learning and the
occurrence of an optimal empirical sequence depend on the randomness,
and show that in the early stages the randomness is mainly
characterized by the domain size, whereas as the number of time steps
increases, the randomness depends increasingly on the complexity of the
Markov decision process.
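The parsing step behind a Lempel-Ziv randomness measure can be sketched as LZ78 incremental parsing: count the distinct phrases, a quantity that grows with the randomness of the sequence. This is a generic sketch, not the paper's exact codeword-length measure.

```python
def lz78_phrase_count(sequence):
    """Number of phrases in the LZ78 incremental parsing of a sequence.

    More random sequences parse into more phrases, so the phrase count
    (and hence the codeword length) serves as a randomness measure.
    """
    dictionary = {""}
    phrase = ""
    count = 0
    for symbol in sequence:
        phrase += symbol
        if phrase not in dictionary:      # a new phrase ends here
            dictionary.add(phrase)
            count += 1
            phrase = ""
    if phrase:                            # a final, already-seen phrase
        count += 1
    return count

# A constant sequence parses into fewer phrases than a periodic one.
c_const = lz78_phrase_count("a" * 18)   # a | aa | aaa | aaaa | aaaaa | aaa
c_alt = lz78_phrase_count("ab" * 9)     # a | b | ab | aba | ba | bab | abab | ab
```

Applied to an empirical state-action sequence, a falling phrase count over training would indicate that learning is eliminating randomness, in line with the abstract.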
IEEE Transactions on Neural Networks, Vol. 15, No. 4,
pp. 792-799, July 2004.
Abstract
In this paper, we regard the sequence of returns as outputs from a
parametric compound source.
Utilizing the fact that the coding rate of the source shows the amount
of information about the return, we describe $\ell$-learning
algorithms based on the predictive coding idea for estimating an
expected information gain concerning future information, and give a
convergence proof of the information gain.
Using the information gain, we propose the ratio $w$ of return loss to
information gain as a new criterion to be used in probabilistic action
selection strategies.
In experimental results, we found that our $w$-based strategy performs
well compared with the conventional Q-based strategy.
Keywords
reinforcement learning, predictive coding, information gain,
probabilistic action selection strategy
International Journal of Knowledge-based Intelligent
Engineering Systems, Vol. 6, No. 2, pp. 112-119, Apr. 2002.
Abstract
Many active learning methods for training a partially trained MLP
have been proposed.
We note that the performance of any active learning method depends
on the initial training data.
The initial training data plays an important role in active
learning performance, because any active learning algorithm
generates additional training data based on the existing training data.
Most conventional methods generate the initial data at random
using pseudo-random numbers.
However, in some practical cases, we cannot prepare enough data owing
to limits on time and cost.
Therefore, the bias of the initial data becomes critical,
especially when the initial data set is small.
In this paper, using the $L_{2}$-discrepancy, we verify the effect of the
bias of the initial data on the resulting classification accuracy.
We consequently confirm the advantage of low-discrepancy sequences
over pseudo-random numbers as a method for generating initial
training data.
Finally, we also discuss some advantages and disadvantages of
low-discrepancy sequences.
Keywords
active learning, multi-layer perceptron, initial training data,
pseudo-random number, low-discrepancy sequence
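As a concrete example of a low-discrepancy generator, the following sketches a 2-D Halton sequence (bases 2 and 3), a common choice built from van der Corput sequences; the paper's experiments may use a different sequence, and the names here are illustrative.

```python
def halton(index, base):
    """The index-th element of the van der Corput sequence in the given base:
    reverse the base-b digits of the index across the radix point."""
    result, f = 0.0, 1.0
    while index > 0:
        f /= base
        result += f * (index % base)
        index //= base
    return result

def halton_2d(n):
    """First n points of a 2-D Halton sequence (coprime bases 2 and 3),
    which fill the unit square far more evenly than pseudo-random points."""
    return [(halton(i, 2), halton(i, 3)) for i in range(1, n + 1)]

points = halton_2d(100)
```

Used as initial training inputs for an MLP, such points reduce the sampling bias that pseudo-random initial data can exhibit when the initial set is small.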