Statistica Sinica: Volume 29, Number 3, July 2019 This is an example of an RSS feed http://www3.stat.sinica.edu.tw/statistica/ Thu, 27 June 2019 00:01:00 +0000 Thu, 27 June 2019 00:01:00 +0000 1800
/statistica/J29N3/J29N31/J29N31.html
A MODEL FOR LARGE MULTIVARIATE SPATIAL DATA SETS William Kleiber, Douglas Nychka and Soutir Bandyopadhyay 1085-1104<span style='font-size=12pt;'><center>Abstract</center> Multivariate spatial modeling is a rapidly growing field; however, most extant models are infeasible for use with massive spatial processes. In this work, we introduce a highly flexible, interpretable, and scalable multiresolution approach to multivariate spatial modeling. Compactly supported basis functions and Gaussian Markov random field specifications for the coefficients yield efficient and scalable calculation routines for likelihood evaluations and co-kriging. We analytically show that special parameterizations approximate popular existing models. Moreover, the multiresolution approach allows for an arbitrary specification of scale dependence between processes. We use Monte Carlo studies to illustrate the implied stochastic behavior of our approach and to test our ability to recover scale dependence. Moreover, we examine a complex large bivariate observational minimum and maximum temperature data set for the western United States. <p>Key words and phrases: Coherence, multiresolution, scale dependence, sparse, Wendland.</span>
/statistica/J29N3/J29N310/J29N310.html
LE CAM MAXIMIN TESTS FOR SYMMETRY OF CIRCULAR DATA BASED ON THE CHARACTERISTIC FUNCTION Simos Meintanis and Thomas Verdebout 1301-1320<span style='font-size=12pt;'><center>Abstract</center> We consider asymptotic inferences for circular data based on empirical characteristic functions. More precisely, we provide tests for reflective symmetry of circular data based on the imaginary part of the empirical characteristic function. We show that the proposed tests have many attractive features including the property of being locally and asymptotically maximin in the Le Cam sense under sine-skewed alternatives in the specified mean direction case. To the best of our knowledge, this result provides the first instance of such an optimality property for empirical characteristic functions. For the unspecified mean direction case, we provide corrected versions of the original tests that retain nice asymptotic power properties. The results are illustrated using a well-known data set and are checked using Monte-Carlo simulations. <p>Key words and phrases: Characteristic function, directional statistics, reflective symmetry.</span>
/statistica/J29N3/J29N311/J29N311.html
EMPIRICAL LIKELIHOOD ESTIMATION USING AUXILIARY SUMMARY INFORMATION WITH DIFFERENT COVARIATE DISTRIBUTIONS Peisong Han and Jerald F. Lawless 1321-1342<span style='font-size=12pt;'><center>Abstract</center> The potential use of auxiliary summary information to improve the efficiency of estimation has attracted significant interest. Most existing methods assume that the data distribution is the same for the sample data and for the population that generates the auxiliary information. However, recent works have relaxed this assumption by allowing heterogeneity between the two covariate distributions. We consider an empirical likelihood approach that guarantees that using auxiliary information will increase the efficiency of estimation when the variability associated with this information is sufficiently small. We also investigate the effects of this variability on the efficiency. Furthermore, we implement the proposed approach using a Newton-Raphson-type algorithm. Lastly, we discuss our simulation results, which demonstrate the efficiency gains and confirm the large sample approximations. <p>Key words and phrases: Auxiliary information, data integration, empirical likelihood, estimation efficiency, information uncertainty, summary information.</span>
/statistica/J29N3/J29N312/J29N312.html
CRITERIA FOR MULTIPLE SURROGATES Peng Luo, Zheng Cai and Zhi Geng 1343-1366<span style='font-size=12pt;'><center>Abstract</center> An observed surrogate endpoint is often used to predict a treatment effect on an unobserved true endpoint when it is difficult or expensive to measure the true endpoint. Although several criteria have been proposed for identifying surrogate endpoints, they all suffer from the surrogate paradox: a treatment has a positive effect on the surrogate and the surrogate has a positive effect on the endpoint; however, the treatment has a negative effect on the endpoint. To avoid this paradox, criteria have been proposed for a single surrogate that blocks the path from the treatment to the endpoint. This requires that there is a single path from the treatment to the endpoint and that the surrogate can block this path. However, in many applications, a treatment may affect an endpoint through several paths. Therefore, we use stochastic orders of random vectors to derive criteria for multiple surrogates that avoid the surrogate paradox and can be used to predict the sign of the treatment effect on the unobserved true endpoint. Furthermore, under the conditional independence of the treatment and the true endpoint, given the multiple surrogates, we propose sufficient conditions for the sign-equivalence of the treatment effects on the surrogates and on the true endpoint. Lastly, we illustrate how these criteria can be applied to several commonly used models. <p>Key words and phrases: Average causal effect, Prentice criteria, stochastic order, surrogate paradox.</span>
/statistica/J29N3/J29N313/J29N313.html
COMPOSITE ESTIMATION: AN ASYMPTOTICALLY WEIGHTED LEAST SQUARES APPROACH Lu Lin, Feng Li, Kangning Wang and Lixing Zhu 1367-1393<span style='font-size=12pt;'><center>Abstract</center> The purpose of this study is three-fold. First, based on an asymptotic presentation of initial estimators and model-independent parameters, either hidden in the model or combined with the initial estimators, a pro forma linear regression between the initial estimators and the parameters is defined in an asymptotic sense. Then, a weighted least squares estimation is constructed within this framework. Second, systematic studies are conducted to examine when both the variance and the bias can be reduced simultaneously, and when only the variance can be reduced. Third, a generic rule for constructing a composite estimation and unified theoretical properties is introduced. Important examples, such as a quantile regression, nonparametric kernel estimation, and blockwise empirical likelihood estimation, are investigated to explain the methodology and theory. Simulations are conducted to examine the performance of the proposed method in finite sample situations and a real-data set is analyzed as an illustration. Lastly, the proposed method is compared to existing competitors. <p>Key words and phrases: Asymptotic representation, composite quantile regression, model-independent parameter, nonparametric regression, weighted least squares.</span>
/statistica/J29N3/J29N314/J29N314.html
COMPOSITE ESTIMATION: AN ASYMPTOTICALLY WEIGHTED LEAST SQUARES APPROACH Lu Lin, Feng Li, Kangning Wang and Lixing Zhu 1367-1393<span style='font-size=12pt;'><center>Abstract</center> We consider a setting in which we construct a binary classifier from a panel of features in order to optimize either the sensitivity at a fixed specificity level or the area under the partial receiver operating characteristic (ROC) curve. To this end, we propose an efficient iterative numerical algorithm to solve a simple constrained optimization problem that mimics the original target. We also present the associated asymptotic statistical inference procedures, including the construction of the credible intervals for the realized sensitivity/specificity or the area under the partial ROC curve of the estimated risk scores. We apply the method to simulated data sets and show that the proposed method outperforms the classifiers based on the generic logistic regression, without considering the specific criterion we want to optimize. We also apply the new proposed method to two real-data examples. <p>Key words and phrases: Feature ensemble, ROC curve, sensitivity, specificity.</span>
/statistica/J29N3/J29N315/J29N315.html
OPTIMAL PAIRED CHOICE BLOCK DESIGNS Rakhi Singh, Ashish Das and Feng-Shun Chai 1419-1438<span style='font-size=12pt;'><center>Abstract</center> Choice experiments help manufacturers, service providers, policymakers, and other researchers to make business decisions. Traditionally, in a discrete-choice experiment, each respondent is shown the same collection of choice pairs (i.e., the choice design). In addition, as the number of attributes and/or the number of levels under each attribute increases, the number of choice pairs in an optimal paired choice design increases rapidly. Moreover, in the literature on utility-neutral setups, random subsets of theoretically obtained optimal designs are often allocated to respondents. This raises the question of whether we can do better than simply using a random allocation of subsets. We answer this question using a linear paired-comparison model (or, equivalently, a multinomial logit model), where we first incorporate the fixed respondent effects (also referred to as the block effects), and then obtain optimal designs for the parameters of interest. Our approach is simple and theoretically tractable, unlike other approaches that are algorithmic in nature. We present several constructions of optimal block designs that can be used to estimate main effects or main plus two-factor interaction effects. Our results show when and how an optimal design for the model without blocks can be split into blocks such that the optimality properties are retained under the block model. <p>Key words and phrases: Choice experiment, Hadamard matrix, linear paired comparison model, multinomial logit model, orthogonal array, utility-neutral setup.</span>
/statistica/J29N3/J29N316/J29N316.html
OPTIMAL PAIRED CHOICE BLOCK DESIGNS Rakhi Singh, Ashish Das and Feng-Shun Chai 1419-1438<span style='font-size=12pt;'><center>Abstract</center> Quantile regression estimators at a fixed quantile level rely mainly on a small subset of the observed data. As a result, efforts have been made to construct simultaneous estimations at multiple quantile levels in order to take full advantage of all observations and to improve the estimation efficiency. We propose a novel approach that links multiple linear quantile models by imposing a condition on the rank of the matrix formed by all of the regression parameters. This approach resembles a reduced-rank regression, but also shares similarities with the dimension-reduction modeling. We develop estimation and inference tools for such models and examine their optimality in terms of the asymptotic estimation variance. We use simulation experiments to examine the numerical performance of the proposed procedure, and a data example to further illustrate the method.<p>Key words and phrases: Check function, composite quantile regression, generalized method of moment, linear quantile regression, optimal estimating equations, quantile regression, reduced-rank regression.</span>
/statistica/J29N3/J29N317/J29N317.html
THE SEMI-PARAMETRIC BERNSTEIN-VON MISES THEOREM FOR REGRESSION MODELS WITH SYMMETRIC ERRORS Minwoo Chae, Yongdai Kim and Bas J. K. Kleijn 1465-1487<span style='font-size=12pt;'><center>Abstract</center> In a smooth semi-parametric model, the marginal posterior distribution of a finite-dimensional parameter of interest is expected to be asymptotically equivalent to the sampling distribution of any efficient point estimator. This assertion leads to asymptotic equivalence of the credible and confidence sets of the parameter of interest, and is known as the semi-parametric Bernstein-von Mises theorem. In recent years, this theorem has received much attention and has been widely applied. Here, we consider models in which errors with symmetric densities play a role. Specifically, we show that the marginal posterior distributions of the regression coefficients in linear regression and linear mixed-effect models satisfy the semi-parametric Bernstein-von Mises assertion. As a result, Bayes estimators in these models achieve frequentist inferential optimality, as expressed, for example, in Hájek's convolution and asymptotic minimax theorems. For the prior on the space of error densities, we provide two well-known examples, namely, the Dirichlet process mixture of normal densities and random series priors. The results provide efficient estimates of the regression coefficients in the linear mixed-effect model, for which no efficient point estimators currently exist. <p>Key words and phrases: Bernstein-von Mises theorem, linear mixed-effect model, linear regression, semi-parametric efficiency, symmetric error.</span>
/statistica/J29N3/J29N318/J29N318.html
SEMIPARAMETRIC REGRESSION MODEL FOR RECURRENT BACTERIAL INFECTIONS AFTER HEMATOPOIETIC STEM CELL TRANSPLANTATION Chi Hyun Lee, Chiung-Yu Huang, Todd E. DeFor, Claudio G. Brunstein, Daniel J. Weisdorf and Xianghua Luo 1489-1509<span style='font-size=12pt;'><center>Abstract</center> Patients who undergo hematopoietic stem cell transplantation (HSCT) often experience multiple bacterial infections during the early post-transplant period. In this article, we consider a semiparametric regression model that correlates patient- and transplant-related risk factors with inter-infection gap times. Existing regression methods for recurrent gap times are not directly applicable to studies of post-transplant infections because the initiating event (i.e., the transplant) is different to the recurrent events of interest (i.e., post-transplant infections). As a result, the time between a transplant and the first infection and that between consecutive infections have distinct biological meanings and, hence, follow different distributions. Moreover, risk factors may have different effects on these two types of gap times. Therefore, we propose a semiparametric estimation procedure that lets us simultaneously evaluate the covariate effects on the time between a transplant and the first infection and on the gap times between consecutive infections. The proposed estimator accounts for dependent censoring induced by within-subject correlation between recurrent gap times and length bias in the last censored gap time due to intercept sampling. We study the finite sample properties through simulations and apply the proposed method to post-HSCT bacterial infection data collected at the University of Minnesota. <p>Key words and phrases: Accelerated failure time model, gap times, recurrent events, semiparametric method, weighted risk-set method.</span>
/statistica/J29N3/J29N319/J29N319.html
AN ADAPTIVE-TO-MODEL TEST FOR PARAMETRIC SINGLE-INDEX ERRORS-IN-VARIABLES MODELS Hira L. Koul, Chuanlong Xie and Lixing Zhu 1511-1534<span style='font-size=12pt;'><center>Abstract</center> This study provides a useful test for parametric single-index regression models when covariates are measured with errors and validation data are available. The proposed test is asymptotically unbiased, and its consistency rate does not depend on the dimension of the covariate vector. The proposed test behaves like a classical local smoothing test with only one covariate, and retains the omnibus property against general alternatives. This suggests that the proposed test can potentially alleviate the difficulty associated with the curse of dimensionality in this field. Furthermore, a systematic study is conducted to investigate the effect of the ratio between the sample size and the size of the validation data on the asymptotic behavior of these tests. Lastly, simulations are conducted to examine the performance in several finite sample scenarios. <p>Key words and phrases: Adaptive-to-model test, dimension reduction, errors-in-variables model.</span>
/statistica/J29N3/J29N32/J29N32.html
A STOCHASTIC GENERATOR OF GLOBAL MONTHLY WIND ENERGY WITH TUKEY <I>g</I>-AND-<I>h</I> AUTOREGRESSIVE PROCESSES Jaehong Jeong, Yuan Yan, Stefano Castruccio and Marc G. Genton 1105-1126<span style='font-size=12pt;'><center>Abstract</center> Quantifying the uncertainty of wind energy potential from climate models is a time-consuming task and requires considerable computational resources. A statistical model trained on a small set of runs can act as a stochastic approximation of the original climate model, and can assess the uncertainty considerably faster than by resorting to the original climate model for additional runs. While Gaussian models have been widely employed as means to approximate climate simulations, the Gaussianity assumption is not suitable for winds at policy-relevant (i.e., subannual) time scales. We propose a trans-Gaussian model for monthly wind speed that relies on an autoregressive structure with a Tukey g-and-h transformation, a flexible new class that can separately model skewness and tail behavior. This temporal structure is integrated into a multi-step spectral framework that can account for global nonstationarities across land/ocean boundaries, as well as across mountain ranges. Inferences are achieved by balancing memory storage and distributed computation for a big data set of 220 million points. Once the statistical model was fitted using as few as five runs, it can generate surrogates rapidly and efficiently on a simple laptop. Furthermore, it provides uncertainty assessments very close to those obtained from all available climate simulations (40) on a monthly scale.<p>Key words and phrases: Big data, nonstationarity, spatio-temporal covariance model, sphere, stochastic generator, Tukey g-and-h autoregressive model, wind energy.</span>
/statistica/J29N3/J29N320/J29N320.html
HIGH-DIMENSIONAL SEMIPARAMETRIC ESTIMATE OF LATENT COVARIANCE MATRIX FOR MATRIX-VARIATE Lu Niu and Junlong Zhao 1535-1559<span style='font-size=12pt;'><center>Abstract</center> Estimating the covariance matrix of a high-dimensional matrix-variate is an important issue. As such, many methods have been developed, typically based on the sample covariance matrix under a Gaussian or sub-Gaussian assumption. However, the sub-Gaussian assumption is restrictive and the estimate based on the sample covariance matrix is not robust. In this study, we estimate the covariance matrix of a high-dimensional matrix-variate using a transelliptical distribution and Kendall's τ correlation. Because the covariance matrix of a matrix-variate is commonly assumed to have a low-dimensional structure, we consider the structure of the Kronecker expansion. The asymptotic results of the estimator are established. Simulation results and a real-data analysis confirm the effectiveness of our method. <p>Key words and phrases: Kronecker product, latent covariance (correlation) matrix, matrix-variate, robust estimate.</span>
/statistica/J29N3/J29N321/J29N321.html
GENERALIZED LINEAR CEPSTRAL MODELS FOR THE SPECTRUM OF A TIME SERIES Tommaso Proietti and Alessandra Luati 1561-1583<span style='font-size=12pt;'><center>Abstract</center> This paper introduces a class of generalized linear models with a Box-Cox link for the spectrum of a time series. The Box-Cox transformation of the spectral density is represented as a finite Fourier polynomial. Here, the coefficients of the polynomial, called generalized cepstral coefficients, provide a complete characterization of the properties of the random process. The link function depends on a power-transformation parameter, and can be expressed as an exponential model (logarithmic link), an autoregressive model (inverse link), or a moving average model (identity link). An advantage of this model class is the possibility of nesting alternative spectral estimation methods within the same likelihood-based framework. As a result, selecting a particular parametric spectrum is equivalent to estimating the transformation parameter. We also show that the generalized cepstral coefficients are a one-to-one function of the inverse partial autocorrelations of the process, which can be used to evaluate the mutual information between the past and the future of the process.<p>Key words and phrases: Box-Cox link, generalised linear models, mutual information, Whittle likelihood.</span>
/statistica/J29N3/J29N322/J29N322.html
MM ALGORITHMS FOR VARIANCE COMPONENT ESTIMATION AND SELECTION IN LOGISTIC LINEAR MIXED MODEL Liuyi Hu, Wenbin Lu, Jin Zhou and Hua Zhou 1585-1605<span style='font-size=12pt;'><center>Abstract</center> Logistic linear mixed models are widely used in experimental designs and genetic analyses of binary traits. Motivated by modern applications, we consider the case of many groups of random effects, where each group corresponds to a variance component. When the number of variance components is large, fitting a logistic linear mixed model is challenging. Thus, we develop two efficient and stable minorization-maximization (MM) algorithms for estimating variance components based on a Laplace approximation of the logistic model. One of these leads to a simple iterative soft-thresholding algorithm for variance component selection using the maximum penalized approximated likelihood. We demonstrate the variance component estimation and selection performance of our algorithms by means of simulation studies and an analysis of real data. <p>Key words and phrases: Generalized linear mixed model (GLMM), Laplace approximation, MM algorithm, variance components selection.</span>
/statistica/J29N3/J29N323/J29N323.html
THE RESTRICTED CONSISTENCY PROPERTY OF LEAVE-𝓃<sub>𝓋</sub>-OUT CROSS-VALIDATION FOR HIGH-DIMENSIONAL VARIABLE SELECTION Yang Feng and Yi Yu 1607-1630
/statistica/J29N3/J29N33/J29N33.html
SPATIAL JOINT SPECIES DISTRIBUTION MODELING USING DIRICHLET PROCESSES Shinichiro Shirota, Alan E. Gelfand and Sudipto Banerjee 1127-1154<span style='font-size=12pt;'><center>Abstract</center> Species distribution models usually attempt to explain the presence-absence or abundance of a species at a site in terms of the environmental features (so-called abiotic features) present at the site. Historically, such models have considered species individually. However, it is well established that species interact to influence the presence-absence and abundance (envisioned as biotic factors). As a result, recently joint species distribution models with various types of responses, such as presence-absence, continuous, and ordinal data have attracted a significant amount of interest. Such models incorporate the dependence between species' responses as a proxy for interaction. We address the accommodation of such modeling in the context of a large number of species (e.g., order 10<sup>2</sup>) across sites numbering in the order of 10<sup>2</sup> or 10<sup>3</sup> when, in practice, only a few species are found at any observed site. To do so, we adopt a dimension-reduction approach. The novelty of our approach is that we add spatial dependence. That is, we consider a collection of sites over a relatively small spatial region. As such, we anticipate that the species distribution at a given site will be similar to that at a nearby site. Specifically, we handle dimension reduction using Dirichlet processes, which enables the clustering of species, and add spatial dependence across sites using Gaussian processes. We use simulated data and a plant communities data set for the Cape Floristic Region (CFR) of South Africa to demonstrate our approach. The latter consists of presence-absence measurements for 639 tree species at 662 locations. These two examples demonstrate the improved predictive performance of our method using the aforementioned specification. 
<p>Key words and phrases: Dimension reduction; Gaussian processes; high-dimensional covariance matrix; spatial factor model; species dependence.</span>
/statistica/J29N3/J29N34/J29N34.html
SPATIAL FACTOR MODELS FOR HIGH-DIMENSIONAL AND LARGE SPATIAL DATA: AN APPLICATION IN FOREST VARIABLE MAPPING Daniel Taylor-Rodriguez, Andrew O. Finley, Abhirup Datta, Chad Babcock, Hans-Erik Andersen, Bruce D. Cook, Douglas C. Morton and Sudipto Banerjee 1155-1180<span style='font-size=12pt;'><center>Abstract</center> Gathering information about forest variables is an expensive and arduous activity. Therefore, directly collecting the data required to produce high-resolution maps over large spatial domains is infeasible. Next-generation collection initiatives for remotely sensed light detection and ranging (LiDAR) data are specifically aimed at producing complete-coverage maps over large spatial domains. Given that LiDAR data and forest characteristics are often strongly correlated, it is possible to use the former to model, predict, and map forest variables over regions of interest. This entails dealing with high-dimensional (~10<sup>2</sup>) spatially dependent LiDAR outcomes over a large number of locations (~10<sup>5</sup> - 10<sup>6</sup>). With this in mind, we develop the spatial factor nearest neighbor Gaussian process (SF-NNGP) model, which we embed in a two-stage approach that connects the spatial structure found in LiDAR signals with forest variables. We provide a simulation experiment that demonstrates the inferential and predictive performance of the SF-NNGP, and use the two-stage modeling strategy to generate complete-coverage maps of the forest variables, with associated uncertainty, over a large region of boreal forests in interior Alaska.<p>Key words and phrases: Forest outcomes, LiDAR data, nearest neighbor Gaussian processes, spatial prediction.</span>
/statistica/J29N3/J29N35/J29N35.html
SPATIO-TEMPORAL MODELS WITH SPACE-TIME INTERACTION AND THEIR APPLICATIONS TO AIR POLLUTION DATA Soudeep Deb and Ruey S. Tsay 1181-1207<span style='font-size=12pt;'><center>Abstract</center> It is important to have a clear understanding of the status of air pollution and to provide forecasts and insights related to air quality to both the public and environmental researchers. Previous studies have shown that even a short-term exposure to high concentrations of atmospheric fine particulate matter can be hazardous to people's health. In this study, we develop a spatio-temporal model with space-time interaction for air pollution data (PM<sub>2.5</sub>). Along with the spatial and temporal components, the proposed model uses a parametric space-time interaction component in the mean structure, as well as a random-effects component specified in the form of zero-mean spatio-temporal processes. To apply the model, we analyze air pollution data (PM<sub>2.5</sub>) from 66 monitoring stations across Taiwan. <p>Key words and phrases: Dynamical dependence, fine particulate matter, Lagrange multiplier test, spatial dependence.</span>
/statistica/J29N3/J29N36/J29N36.html
EFFICIENT ESTIMATION OF NONSTATIONARY SPATIAL COVARIANCE FUNCTIONS WITH APPLICATION TO HIGH-RESOLUTION CLIMATE MODEL EMULATION Yuxiao Li and Ying Sun 1209-1231<span style='font-size=12pt;'><center>Abstract</center> Spatial processes exhibit nonstationarity in many climate and environmental applications. Convolution-based approaches are often used to construct nonstationary covariance functions in Gaussian processes. Although convolution-based models are flexible, their computation is extremely expensive when the data set is large. Most existing methods rely on fitting an anisotropic, but stationary model locally, and then reconstructing the spatially varying parameters. In this study, we propose a new estimation procedure to approximate a class of nonstationary Matérn covariance functions by local-polynomial fitting the covariance parameters. The proposed method allows for efficient estimation of a richer class of nonstationary covariance functions, with the local stationary model as a special case. We also develop an approach for a fast high-resolution simulation with nonstationary features on a small scale and apply it to precipitation data in climate model outputs. <p>Key words and phrases: Climate model runs, conditional simulation, large datasets, local likelihood estimation, nonstationary Matérn covariance function, polynomial approximation.</span>
/statistica/J29N3/J29N37/J29N37.html
SEMIPARAMETRIC MODELING WITH NONSEPARABLE AND NONSTATIONARY SPATIO-TEMPORAL COVARIANCE FUNCTIONS AND ITS INFERENCE Tingjin Chu, Jun Zhu and Haonan Wang 1233-1252<span style='font-size=12pt;'><center>Abstract</center> In this study, we develop a new semiparametric approach to model geostatistical data measured repeatedly over time. In addition, we draw inferences about the parameters and components of the underlying spatio-temporal process. Dependence in time and across space is modeled semiparametrically, giving rise to a class of nonseparable and nonstationary spatio-temporal covariance functions. A two-step procedure is devised to estimate the model parameters based on the likelihood of detrended data, and the computational algorithm is efficient owing to the dimension reduction. Extensions to spatio-temporal processes with general mean trends are also considered. Furthermore, the asymptotic properties of our proposed method are established, including consistency and asymptotic normality. A simulation study shows the sound finite-sample properties of the proposed method, and a real-data example is used to compare our method with alternative approaches. <p>Key words and phrases: Geostatistics, semiparametric methods, spatio-temporal processes.</span>
/statistica/J29N3/J29N38/J29N38.html
A TEST FOR ISOTROPY ON A SPHERE USING SPHERICAL HARMONIC FUNCTIONS Indranil Sahoo, Joseph Guinness and Brian J. Reich 1253-1276<span style='font-size=12pt;'><center>Abstract</center> Analyses of geostatistical data are often based on the assumption that the spatial random field is isotropic. This assumption, if erroneous, can adversely affect model predictions and statistical inferences. Today, many applications consider global data, and hence, it is necessary to check the assumption of isotropy on a sphere. This study proposes a test for spatial isotropy on a sphere. The data are first projected onto the set of spherical harmonic functions. Under isotropy, the spherical harmonic coefficients are uncorrelated, but are correlated if the underlying fields are not isotropic. This motivates a test based on the sample correlation matrix of the spherical harmonic coefficients. In particular, we use the largest eigenvalue of this matrix as the test statistic. Extensive simulations are conducted to assess the Type-I errors of the test under different scenarios. Our method requires temporal replication in the data and, hence, is applicable to many data sets in the Earth sciences. We show how temporal correlation affects the test and provide a method for handling such correlation. We also gauge the power of the test as we move away from isotropy. The method is applied to near-surface air temperature data, which is part of the HadCM3 model output. Although we do not expect global temperature fields to be isotropic, we propose several anisotropic models, with increasing complexity, each of which has an isotropic process as a model component. Then, we apply the test to the isotropic component in a sequence of such models to determine how well the models capture the anisotropy in the fields.<p>Key words and phrases: Anisotropy, spatial statistics, spherical harmonic representation.</span>
/statistica/J29N3/J29N39/J29N39.html
A POPULARITY-SCALED LATENT SPACE MODEL FOR LARGE-SCALE DIRECTED SOCIAL NETWORK Xiangyu Chang, Danyang Huang and Hansheng Wang 1277-1299<span style='font-size=12pt;'><center>Abstract</center> Large-scale directed social network data often include degree heterogeneity, reciprocity, and transitivity properties. Thus, a sensible network-generating model should consider these features. To this end, we propose a popularity-scaled latent space model for large-scale directed network structure formulations. This model assumes each node occupies a position in a hypothetically assumed latent space. Then, the nodes close to (far away from) each other should have a higher (lower) probability of being connected. Thus, reciprocity and transitivity can be derived analytically. In addition, we assume a popularity parameter for each node. Nodes with larger (smaller) popularity are more (less) likely to be followed. By assuming different distributions for the popularity parameters, we model various types of degree heterogeneity. Based on the proposed model, we construct a comprehensive probabilistic index for link prediction. We demonstrate the performance of the proposed model using simulation studies and a Sina Weibo data set. The results show that the performance of the model is competitive. <p>Key words and phrases: Degree heterogeneity, large-scale social network, latent space model, link prediction, reciprocity, transitivity.</span>