1 Introduction
Nonrigid shape registration is an important area of research in medical imaging, in particular for establishing crosssubject spatial correspondence in the cerebral cortex. This type of spatial alignment has been shown to improve the statistical power of group functional MRI (fMRI) analysis Van Atteveldt et al. (2004); Frost and Goebel (2012) resulting from the improved correspondence of functional areas. Due to the geometric complexity of the cortex and the large variability between individuals, cortical surface registration remains a challenging task. Intersubject surface alignment is commonly driven by geometric features that describe measures of cortical shape (folding), such as sulcal depth or local curvature Fischl et al. (1999); Yeo et al. (2010); Conroy et al. (2013); Tardif et al. (2015).
A widely used cortical surface registration approach is to map the surface onto the unit sphere in order to perform computations in this canonical domain. Existing efforts are mainly focused on the adaptation of registration algorithms in the Euclidean space Fischl et al. (1999); Yeo et al. (2010); Robinson et al. (2014). These aim to optimize a similarity metric between the target and the deformed source volumes, regularized by various energies Sotiras et al. (2013). FreeSurfer Fischl et al. (1999)
registers an individual surface to a probabilistic atlas computed from a representative set of subjects by minimizing the squared difference between the average convexity across subjects and that of the individual, weighted by the inverse variance of the convexity across subjects. The Multimodal Surface Matching tool
Robinson et al. (2014)uses a similarity between the input and reference mesh features in a coarsetofine manner. The deformation is driven by aligning local patches around control points, i.e. vertices of a low resolution mesh, and then propagated to the high resolution input mesh via interpolation. Spherical Demons
Yeo et al. (2010) modifies the classical Demons method Thirion (1998)using velocity vectors tangent to the sphere. The twostep optimization of classical Demons also holds for the spherical case in which the second step handles the deformation regularization by spherical thin plate spline interpolation. To encourage desirable mathematical properties such as invertibility, diffeomorphic transforms have seen extensive methodological development, yielding stateoftheart tools
Ashburner (2007); Zhang et al. (2017). Unfortunately, since these methods solve an optimization problem for each image pair, they often exhibit long execution times.High computational costs have led to an increase in the popularity of supervised Krebs et al. (2017); Sokooti et al. (2017); Yang et al. (2016) and unsupervised Balakrishnan et al. (2019); Dalca et al. (2018, 2019) learningbased registration algorithms. Considering the difficulty of establishing ground truth spatial correspondences, supervised methods require predictions from existing algorithms Yang et al. (2016), simulations Sokooti et al. (2017), or both Krebs et al. (2017)
. In contrast, unsupervised methods make use of Spatial Transformer Networks (STN)
Jaderberg et al. (2015) to warp the moving image in a differentiable way, enabling endtoend training Balakrishnan et al. (2019); Dalca et al. (2019); Jason et al. (2016); Krebs et al. (2019); de Vos et al. (2019). Some unsupervised methods Balakrishnan et al. (2019); Dalca et al. (2019); Krebs et al. (2019) model a stationary velocity field as latent variables representing deformations in a generative probabilistic model. They use a scaling and squaring layer Arsigny et al. (2006) for the Lie group exponentiation of the velocity field to generate diffeomorphic transforms, thus guaranteeing topology preservation. These methods have demonstrated high quality performance in registering various types of medical images. Therefore, we build on these concepts when working with surfaces.Recent studies have developed geometric convolutional neural networks (CNN) Su and Grauman (2017); Cohen et al. (2018); Coors et al. (2018); Seong et al. (2018); Jiang et al. (2019) that operate on a spherical manifold to solve classification and detection tasks. To address the distortions introduced by projecting signals to a planar image, regular convolutions with increased kernel sizes near polar regions have been utilized Su and Grauman (2017). Spherical CNNs encode rotational instead of translational equivariance into the network in Euclidean space to solve classification problems Cohen et al. (2018). In SphereNet, the convolution kernel on the sphere has been approximated by encoding the vertex neighborhood information on 2D tangent planes, which enables adapting existing CNN architectures to the omnidirectional setup for object detection and classification tasks Coors et al. (2018). Geometric CNNs (gCNN) Seong et al. (2018) also deal with convolution and pooling operations of a CNN on a mesh surface. However, they have so far only been tested on sex classification, using cortical thickness images. A convolutional kernel discretized by an unstructured mesh was recently proposed and evaluated on spherical MNIST classification and 3D object detection tasks Jiang et al. (2019). Most existing work, however, focuses on the construction of spherical convolutional kernels and, to the best of our knowledge, neural networks have not yet been extended to surface registration. Compared with classification or detection tasks, besides convolution and pooling operations on spheres, a learningbased registration method should address local deformations defined on spheres. However, existing spatial transformation networks Esteves et al. (2018); Tai et al. (2019) for spheres only address global deformations, which are not suitable for accommodating nonlinear deformation fields.
In this paper, we propose a diffeomorphic framework combining a generative model for surfaces with CNNs to register individual cortical surfaces to an atlas space. This framework adapts conventional VoxelMorph for registering Euclidean images Dalca et al. (2019)
to spherical manifolds. In order to address the limitations of 2D planar projection, we construct a weighted neighborhood graph defined on 2D grids, which accounts for the nonuniform metric tensor of the spherical representation to encode a stationary velocity field. Considering that the 2D projection operation samples the arclength for each latitude to the same number of points, we also take sampling distortion into account at different latitudes in the likelihood model. We quantify the performance of our framework through two applications: the generation of cortical parcellations and the alignment of functional activations. The experimental results demonstrate that our framework yields better registration accuracy to stateoftheart classical methods at a significantly reduced computational cost, and more accurate results compared to current learningbased methods.
The remainder of this paper is organized as follows. We first introduce the cortical registration problem, review the conventional VoxelMorph Dalca et al. (2019) framework, and propose our method and network structure. We then describe two evaluation experiments including cortical parcellation and functional alignment, and show results for these experiments. Finally, we present our discussions and conclusions.
2 Methods
Numerous studies using FreeSurfer have demonstrated its efficacy in sphericalbased cortical registration. We build on ideas for our surface representation from the FreeSurfer spherical registration Fischl et al. (1999) and model the unsupervised learning structure for the registration field following VoxelMorph Dalca et al. (2019).
2.1 Registration problem definition
In the FreeSurfer spherical registration pipeline Fischl et al. (1999), surface geometry is encoded as a convexity attribute at each mesh vertex and the representation of the atlas surface is computed from a group of adult subjects. In order to register the surfaces of an individual to the atlas space, first a white matter mesh is generated and mapped to the unit sphere by minimizing metric distortion Fischl et al. (1999). Next, an optimal rotational alignment is computed by global search over two rotation angles on the sphere. Then a 2D canonical warp is computed to align the subject’s convexity pattern with that of the mean pattern encoded in the atlas, by minimizing the mean squared difference, weighted by the inverse of the atlas variance. Our goal is to compute this canonical warp using a CNN framework.
Let be the unit sphere and the corresponding scalar field over the sphere (e.g. sulcal depth or curvature) projected into 2D longitude/latitude parameterization. Let be the atlas mean image as defined in FreeSurfer, and and be the number of image rows and columns. Let be a diagonal matrix where each diagonal element denotes the variability of the corresponding feature at a particular vertex as defined in the FreeSurfer atlas variance image. The goal is to find the spatial transformation given and
that maximizes the a posteriori probability of the transform assuming certain smoothness priors on the warp.
2.2 VoxelMorph
We assume a diffeomorphic deformation based on a stationary velocity field , denoted as , and adapt a generative probabilistic model following VoxelMorph Dalca et al. (2019)
. VoxelMorph uses Maximum a Posteriori (MAP) estimation to obtain the most likely velocity field
at each voxel/pixel given a pair of images. VoxelMorph models the prior probability of
as a zeromean multivariate normal distribution,
, where is the covariance matrix. An individual image can be estimated by warping the FreeSurfer atlas, thus we model the warped image as , whereis the inverse transformation of the atlas warping to the individual. The aim is to maximize the posterior probability
, where the marginalization over is intractable. In this case, a variational approximation is adopted with parameters , by minimizing its dissimilarity, Kullback–Leibler (KL) divergence, with the true posterior probability. For simplicity, the approximate posterior is restricted to a multivariate normal distribution where and a diagonal are functions estimated with a UNet core Ronneberger et al. (2015), as shown in Fig. 2.Using the above assumptions, maximizing the posterior probability can be approximated by minimizing the following loss:
(1) 
where is short for . The first term describes the reconstruction loss and the second term is a KL divergence term, encouraging the estimated posterior probability to be close to the prior . VoxelMorph encourages the smoothness of the velocity field by setting , where the parameter controls the scale of the velocity field and is the graph Laplacian matrix defined on the Euclidean grid. is computed as , where is the neighborhood adjacency matrix and is the graph degree matrix. Thus, Eq. (1) can be rewritten as:
(2)  
where is the number of samples used to approximate the expectation. We use . We treat the fixed atlas and the warped individual image as vectors. We denote this naive application of the registration of 2D projected images as the 2D VoxelMorph method and it serves as a benchmark in our experiments.
Unfortunately, the 2D projection step introduces two main problems, as shown in Fig. 1:

varying level of distortions with different latitudes (distortion increases from the equator to the poles); and

inability to represent the periodic property of and the geometry of the poles (an enclosed spherical surface is projected onto a rectangular image region, introducing discontinuities at the image borders).
Hence, the 2D VoxelMorph method overweights the alignment for nearpole regions, yielding misalignment in most regions even compared to global rigid registration as shown in Section 3.
2.3 Proposed Method: SphereMorph
To address the above issues, we propose SphereMorph. We start by defining the registration problem in the spherical domain. The spherical representation of an individual’s surface is first rotated for a rigid alignment with the atlas, as in FreeSurfer. The spherical surface is parameterized by the longitude and latitude and sampled to an twodimensional image with a geometric or functional feature, e.g. convexity, assigned as a pixel intensity measure.
Prior correction. We assume that the displacement field is smooth on the sphere considering the anatomical continuity via a graph Laplacian regularizer. We define a neighbour connectivity graph on the spherical manifold and represent the velocity with respect to Cartesian coordinates as a signal defined on this graph. Let denote the conversion from polar to Cartesian coordinates, that is , then the geodesic velocity at each vertex is given by .
Each vertex on the projected image is considered as a node and each grid edge connecting two adjacent nodes as their edge in . We connect leftmost and rightmost nodes due to the periodicity in longitude. The weight of the connection between vertices in varies with location to account for the horizontal edge distance on the spherical surface which is proportional to . Thus, we define the weight of each grid edge connecting vertices , as:
(3) 
We construct the corresponding neighborhood adjacency matrix with entries and the degree matrix with diagonal entries . Finally, we denote the Laplacian of this weighted graph as and define the covariance of geodesic velocity as . Intuitively, this formulation can be seen as increasing the regularization near the poles, where the Euclidean distance between mesh nodes is small, and decreasing the weighting near the equator where the Euclidean distances are larger.
Distortion correction. For VoxelMorph, which deals with Euclidean image registration, the sampling of grid points is equally distributed. However, a spherical parameterization leads to denser sampling grids for regions at higher latitudes as shown in Fig. 1. Thus, we assign mesh locations from these regions lower weights in computing the datafitting term, by introducing a diagonal matrix with each diagonal entry encoding the resampling weight for each vertex and model . The first datafitting term in Eq. (1) is then modified as:
(4) 
Loss function. Starting with Eq. (1) and taking into account the spherical geometry, we arrive at the below objective function:
(5)  
where and . The first term in Eq. (5) is the datafitting term, which encourages matching surfaces after warping and the second term drives the posterior to approximate the smoothness prior defined on a spherical grid.
Network structure. Figure 2 illustrates the individual stages of our pipeline. For a given vertex at location , we utilize inverse gnomonic projection, which maps points on the tangent plane to the spherical surface as in SphereNet Coors et al. (2018), to obtain the corresponding locations on the projected image for the neighbor vertex on its tangent plane. We implement the convolution and pooling operations in each local tangent patch shown in Figure 1 and build a UNet core Ronneberger et al. (2015), which contains four downsampling and four upsampling layers. Following the sampling layer, seven scaling and squaring operators take the layer output, or velocity field, and return a diffeomorphism . 2D VoxelMorph uses a dense spatial transformer layer on after displacement to retrieve the warped image while SphereMorph warps the image by computing the interpolation grids as
for transformer layer. The model is implemented in Keras with a Tensorflow backend and the ADAM optimizer as part of the VoxelMorph package. We conducted all the experiments on the same workstation with Intel Xeon X5550@2.67GHz and used NVIDIA Tesla P40C for all CNNbased methods.
3 Experimental setup
To demonstrate the accuracy and efficiency of the proposed registration framework, SphereMorph, we used two sets of experiments, cortical parcellation and fMRI group analysis, on two independent test data sets.
3.1 Data
3.1.1 Training data set:
We used the surface convexity maps of the left hemispheres of 800 randomly selected FreeSurferprocessed subjects from the publicly available Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset Mueller et al. (2005) as training data and the spherical atlas from FreeSurfer as the fixed image in our model. The ADNI dataset consists of longitudinal T1weighted scans from 836 subjects that are divided into four classes: elderly controls (n = 252), early mild cognitive impairment (eMCI, n =215), late MCI (lMCI, n = 176), and AD (n = 193). The subjects were scanned on average 4.8 times (minimum: a single time; maximum: 11 times; 4013 scans in total), with a mean interval between scans equal to 286 days (minimum: 23 days, maximum: 1567 days). The mean age at baseline of the subjects was years. Since the ADNI project spans multiple sites, different scanners were used to acquire the images; further details on the acquisition can be found at http://www.adniinfo.org.
3.1.2 Test data sets:
(1) We used FreeSurferprocessed MRI scans of 39 subjects from a cohort recruited by the Washington University Alzheimer’s Disease Research Center (ADRC) Van Horn et al. (2001). The MRI scans were acquired on a 1.5T Vision system (Siemens, Erlangen Germany). T1weighted magnetizationprepared rapid gradient echo (MPRAGE) scans were obtained according to the following protocol: two sagittal acquisitions, FOV = 224, Matrix = , Resolution = , TR = 9.7 ms, TE = 4 ms, Flip angle = 10, TI = 20 ms, TD = 200 ms. Two acquisitions were averaged together to increase the contrasttonoise ratio. For the cortical parcellation experiment, we separated the data set into two: 9 validation subjects and 30 heldout test subjects. All subjects have 34 cortical areas manually annotated Desikan et al. (2006), making them ideal for evaluating registration accuracy.
(2) Additionally, we used another set of 100 unrelated young and healthy subjects from the Human Connectome Project (HCP) Van Essen et al. (2013) as a second test set. The HCP project used stateoftheart fMRI hardware and acquisition parameters in a sample of highly educated, healthy subjects. For each subject, seven task fMRI sessions were collected, including working memory, gambling, motor, language, social cognition, relational processing and emotional processing, totaling 48:30 of fMRI data. The acquisition parameters and minimal preprocessing of these data have been described extensively elsewhere Glasser et al. (2013); Barch et al. (2013).
3.2 Baselines
We compared our proposed registration method to rigid registration on the sphere (i.e. two rotations) and four other nonlinear registration methods: a 2D version of VoxelMorph, Multimodal Surface Matching (MSM) Robinson et al. (2014), Spherical Demons (SD) Yeo et al. (2010), and FreeSurfer (FS) spherical registration Fischl et al. (1999)
. We trained 2D VoxelMorph using the same training set as described above and selected the hyperparameter
that yielded the best performance on the validation set. MSM is a surfacebased registration approach that offers significant flexibility with regards to the set of features that are used to drive the spatial alignment. MSM drives the deformation via aligning local patches around control points in a multiresolution fashion. It is implemented on CPU using a fast, multiresolution, discrete optimisation scheme, offering significant computational speedup compared to other classical methods. We ran MSM over three resolution levels with five iterations per level. Spherical Demons, a fast diffeomorphic landmarkfree surface registration tool implements the regularization for its objective function via iterative Gaussian smoothing. We explored a range of smoothing iteration numbers (5, 10, 15, 20) to optimize performance and used the results of 10 iterations.3.3 Evaluation
To evaluate registration accuracy, we relied on the resulting spatial transformations to project the atlas parcellation back to individual scan space. For an accurate registration solution, the test subject’s cortical parcellations will resemble the manually outlined versions. In order to quantify how well they match, we computed the Dice overlap coefficient Dice (1945), the overall Mean Minimum Distance (MMD) as well as the individual MMD measures for each anatomical region. The Dice overlap coefficient, , quantifies the surface area overlap () between manual () and automatic methodgenerated parcels () and the MMD describes the discrepancy between the parcellation boundaries: where denotes the Euclidean distance between a vertex on a manual boundary and its corresponding closest vertex on an automatic methodgenerated boundary. Additionally, we also tested how consistently the various registration methods aligned anatomical features, such as convexity/sulcal depth and curvature.
In the second set of experiments, we mapped all individuals within the group to the HCP’s 2mm standard grayordinates space Glasser et al. (2013), using the displacement field, then computed group maps of taskevoked activations. To evaluate the quality of the group alignment quantitatively, we computed average correlations between the groupaverage and the projected individual activation maps across 86 task contrasts derived from the seven fMRI tasks.
4 Results
4.1 Computational Efficiency
Table 1 summarizes registration accuracy and computation time for all methods. It illustrates that the proposed method ran approximately 20 times faster than the conventional registration method in FreeSurfer. On a CPU, the default FreeSurfer pipeline takes around 13 minutes to complete the spherical registration while the total computation time of our proposed framework is approximately 0.65 minutes, including the initial alignment, deep network deployment, and displacement field mapping. Compared with other registration methods including MSM and Spherical Demons (SD), CNNbased methods provide more than an order of magnitude improvement in execution speed.
Method  Dice  overall MMD ()  CPU time ()  GPU time () 

Rigid    
2D VoxelMorph  
MSM    
Spherical Demons    
FreeSurfer    
SphereMorph 
4.2 Cortical Parcellation experiment
4.2.1 Parcellation Accuracy
Figure 3 shows representative cortical segmentation results from all the methods for two test subjects from the ADRC dataset. The 2D VoxelMorphestimated annotation exhibits large differences compared to the ground truth in the lateral occipital regions (marked by white arrows), while FreeSurfer and SphereMorph provide results close to the manual annotations.
Table 1 provides an overview of registration accuracy by comparing manual annotations to parcellations generated by the different registration methods, including rigid alignment, 2D Voxelmorph, MSM, SD, FreeSurfer as well as our proposed method. Our proposed method and FreeSurfer achieved the highest accuracy. SphereMorph yields significantly higher overall Dice coefficients than MSM after performing a onetail Wilcoxon rank sum test on their respective Dice coefficients (
). The deep learning baseline, 2D VoxelMorph, performed significantly worse. This is due to the fact that this method does not account for the distortions intrinsic to the spherical coordinate system. Figure
4 compares parcelwise Dice overlap coefficient values associated with the different registration methods. Our method produced higher mean Dice overlap coefficients than MSM for all structures except the Entorhinal, Paracentral and Middletemporal, and showed significant improvement in regional Dice overlap coefficient compared to MSM in the Temporalpole (), Parahippocampal (), Transversetemporal (), Caudalmiddlefrontal (), Rostralmiddlefrontal () and Lingual () regions. Compared to Spherical Demons, SphereMorph produced higher mean regional Dice coefficients than SD in 26 out of total 34 regions.4.2.2 Group average sulcal maps
To evaluate the performance on a finer scale, we also computed the group mean sulcal depth maps after registration for the test data. The resulting mean and standard deviation maps are displayed in Figure 5. We assume that a better group alignment leads to a sharper group mean and smaller group variation. As expected, the group mean maps provide more detailed information for all nonlinear registration methods than rigid alignment. Moreover, all these methods show smaller standard deviations, suggesting that they provide better alignment in convexity.
All nonlinear registration methods exhibited similar distributions of sulcal variations across brain regions. Specifically, the precentral, postcentral and insula regions show lower standard deviation, suggesting higher agreement of crosssubject convexity in these regions.
4.2.3 Robustness Analysis
We investigated the impact of pole positioning on the registration accuracy for SphereMorph by projecting the sphere using 9 different north pole locations spanning ,
and registering the corresponding 2D planar images with corresponding atlas data. We generated the deformed sphere and then recomputed all the region and distancebased metrics. To evaluate the robustness with respect to different projection centers, we conducted analysis of variance (ANOVA) between all the computed evaluation metrics for these 9 groups. None of the ANOVA analyses (overall Dice:
, overall MMD: ) found any significant differences between the 9 groups of cortical parcellations, indicating that the registration accuracy is not sensitive to the arbitrary location of the poles.4.3 Crosssubject alignment of fMRI activation maps
4.3.1 Group average maps
Figure 6 illustrates the quality of groupwise alignment of convexity after MSM, Spherical Demons, FreeSurfer, and our method on our second test data set. Our method yielded smaller variations within the group than MSM (), just as in the case of the cortical parcellation experiments.
While we expect folding patterns to be predictive of functional areas, this relationship is variable and complex. In order to assess how well the various methods align functionally homologous regions across subjects, we computed group average functional activation maps for all 86 tasks using registration driven by sulcal depth information. Figure 7 compares group activation results from the 100 subjects for the Gambling reward contrast. Our method improves the functional alignment across subjects over MSM, particularly in the inferiorparietal and precuneus regions, indicated by the arrows. To evaluate the group alignment performance quantitatively, we computed the average correlations between the groupaverage and individual activation maps after registration across 86 task contrasts derived from seven tasks. Figure 8 displays average correlations for the different registration methods. For all 86 contrasts, SphereMorph resulted in significantly higher correlation coefficients than MSM (increase of and relative increase of ).
Method  correlation coefficients  difference with MSM  relative difference () 

MSM      
Spherical Demons  
FreeSurfer  
SphereMorph 
4.3.2 Alignment performance analysis using multimodality features
We evaluated the performance for the proposed SphereMorph using curvature and (separately) T1/T2 features to their respective atlases, using our sulcal depth alignment as initialization. We computed a ‘T1/T2’ atlas from three rounds of CNN registration within group. Figure 9 compares the group agreements in task activation for only sulc and two cascade processes using respective curvature and T1/T2 maps. Using T1/T2 as input post sulcal depthbased registration significantly improves the group agreement level () in four MOTOR tasks, including RF, RFAVG, negRF and AVGRF, as indicated by the arrows. All four tasks are related to right finger tapping. Figure 10 displays the group average activation for RF constrast. Using the T1/T2 map leads to a larger region with positive response in the paracentral and superiorfrontal areas (marked by arrows) compared to using sulcal depth as the only input.
5 Discussions and Conclusions
In this paper we present a learningbased method, SphereMorph, for registering cortical surfaces and investigate its performance using two sets of experiments, by comparing it to rigid alignment, 2D VoxelMorph, MSM, Spherical Demons, and FreeSurfer. The proposed SphereMorph yields results that are comparable or superior to the stateoftheart methods for alignment of folding patterns, cortical parcellation and functional alignment, while offering approximately a computational speedup, showing the accuracy and efficiency of our method for cortical registration.
SphereMorph takes 2D parameterized images as input, outputs the deformation field in a 2D canonical space, then warps the sphere in the original Cartesian space. We directly address two issues associated with the 2D projection: the effects of substantial distortions introduced by the parameterization and the violation of continuity at the borders of the 2D plane (i.e. the imposition of spherical topology). To account for the distortion introduced by the parameterization, we modify the data likelihood term and construct the deformation velocity on a graph weighted by the metric tensor of the parameterization. We use the same weights as in existing work by Khasanova and Frossard Khasanova and Frossard (2017) to construct our graph as it has been shown to be capable of encoding the geometry of an omnidirectional camera in the final feature representation of an image. In the implementation of network structure, we encode the neighborhood information by leveraging a spherical kernel defined in SphereNet Coors et al. (2018) instead of a conventional 2D kernel for the network convolution and pooling operations. In addition, we modify the spatial transformer layer to represent the periodicity of longitude in spherical coordinates. Combining these adaptations, our model shows a higher agreement with manual parcellations compared to 2D VoxelMorph, which does not account for distortions and topological changes induced by the 2D parameterization. For 2D VoxelMorph, each point on the rectangular grid contributes equally to the registration, resulting in the alignment of regions near the poles affecting the energy functional more than other regions with the same area in the Euclidean embedding space. Thus, the optimized deformation is overfitted in these regions. Our experimental results confirm this when comparing the automatically generated parcellations to the manual ones. With an excessive weighting of the alignment of these polar regions, 2D VoxelMorph performs even worse than the initial alignment.
Our proposed registration framework is applicable to any signal or feature that can be represented or sampled onto the cortical surface. However, in the cortical parcellation experiment, the ADRC dataset only has structural scans. Hence, we trained our model as well as MSM and SD using convexity values to drive the registrations. Our proposed SphereMorph shows significant smaller group variations and higher overall Dice coefficients when compared to MSM, suggesting SphereMorph achieves better structural alignment than MSM.
In addition to the accuracy of the structural alignment, when using the same feature (i.e. convexity) for registration, both qualitative and quantitative evaluation results for the taskrelated experiments demonstrate that the proposed method also yields higher agreement of functional regions across subjects than the current registration method in the Human Connectome Project (HCP) pipeline Glasser et al. (2013). SphereMorph improves the withingroup correlation coefficients significantly for all tasks compared with MSM. The group agreement in convexity is also higher after SphereMorph registration than with MSM. This implies the relationship between cortical folding patterns and boundaries of functional areas as demonstrated in previous work Hinds et al. (2008). However, using foldingbased features alone may not be sufficient to provide an accurate functional alignment across the entire cortex due to regions with highly variable folding patterns across subjects, as well as structurefunction variability. Thus, in the functional alignment experiments, we take the sulcal depth registration as initialization and then use cortical T1/T2 maps as input of our network for a multistep alignment. The cortical T1/T2 maps are computed based on the ratio of T1weighted to T2weighted images which correlates with many functionally distinct areas in individual subjects Glasser and Van Essen (2011). The combination of T1/T2 and sulcal depth shows significant improvement in group alignment of MOTORrelated tasks, indicating the correlation between T1/T2 and MOTORrelated regions.
Our proposed model can include other contrasts in addition to the convexity metrics used above to drive the registration. Existing studies have shown the improvement in group registration accuracy relying on features generated from restingstate functional MRI (rfMRI) after conventional convexitydriven registration Robinson et al. (2014); Tong et al. (2017). In the future, we will investigate the use of these to further improve the accuracy of our technique.
6 Acknowledgement
Support for this research was provided in part by the BRAIN Initiative Cell Census Network grant U01MH117023, the National Institute for Biomedical Imaging and Bioengineering (P41EB015896, 1R01EB023281, R01EB006758, R21EB018907, R01EB019956, 5R03EB02275402), the National Institute on Aging (1R56AG064027, 5R01AG008122, R01AG016495), the National Institute of Mental Health 5U01MH10958904, the National Institute of Diabetes and Digestive and Kidney Diseases (1R21DK10827701), the National Institute for Neurological Disorders and Stroke (R01NS0525851, R21NS072652, R01NS070963, R01NS083534, 5U01NS086625, 5U24NS10059103, R01NS105820), and was made possible by the resources provided by Shared Instrumentation Grants 1S10RR023401, 1S10RR019307, and 1S10RR023043. Additional support was provided by the NIH Blueprint for Neuroscience Research (5U01MH093765), part of the multiinstitutional Human Connectome Project. In addition, BF has a financial interest in CorticoMetrics, a company whose medical pursuits focus on brain imaging and measurement technologies. BF’s interests were reviewed and are managed by Massachusetts General Hospital and Partners HealthCare in accordance with their conflict of interest policies.
References
 A logEuclidean framework for statistics on diffeomorphisms. In MICCAI, pp. 924–931. Cited by: §1.
 A fast diffeomorphic image registration algorithm. NeuroImage 38 (1), pp. 95–113. Cited by: §1.
 VoxelMorph: a learning framework for deformable medical image registration. IEEE Transactions on Medical Imaging 38, pp. 1788–1800. Cited by: §1.
 Function in the human connectome: taskfMRI and individual differences in behavior. NeuroImage 80, pp. 169–189. Cited by: §3.1.2.
 Spherical cnns. In International Conference on Learning Representations (ICLR), Cited by: §1.
 Intersubject alignment of human cortical anatomy using functional connectivity. NeuroImage 81, pp. 400–411. Cited by: §1.

SphereNet: learning spherical representations for detection and classification in omnidirectional images.
In
Proceedings of the European Conference on Computer Vision (ECCV)
, pp. 518–533. Cited by: §1, §2.3, §5.  Unsupervised learning for fast probabilistic diffeomorphic registration. MICCAI 11070, pp. 729–738. Cited by: §1.
 Unsupervised learning of probabilistic diffeomorphic registration for images and surfaces. Medical Image Analysis 57, pp. 226–236. Cited by: §1, §1, §1, §2.2, §2.
 A deep learning framework for unsupervised affine and deformable image registration. Medical image analysis 52, pp. 128–143. Cited by: §1.
 An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage 31 (3), pp. 968–980. Cited by: §3.1.2.
 Measures of the amount of ecologic association between species. Ecology 26 (3), pp. 297–302. Cited by: §3.3.
 Polar transformer networks. In International Conference on Learning Representations (ICLR), Cited by: §1.
 Highresolution intersubject averaging and a coordinate system for the cortical surface. Human Brain Mapping 8 (4), pp. 272–284. Cited by: §1, §1, §2.1, §2, §3.2.
 Measuring structural–functional correspondence: spatial variability of specialised brain regions after macroanatomical alignment. NeuroImage 59 (2), pp. 1369–1381. Cited by: §1.
 The minimal preprocessing pipelines for the human connectome project. NeuroImage 80, pp. 105–124. Cited by: §3.1.2, §3.3, §5.
 Mapping human cortical areas in vivo based on myelin content as revealed by T1 and T2weighted MRI. Journal of Neuroscience 31 (32), pp. 11597–11616. Cited by: §5.
 Accurate prediction of v1 location from cortical folds in a surface coordinate system. NeuroImage 39 (4), pp. 1585–1599. Cited by: §5.
 Spatial transformer networks. In Advances in Neural Information Processing Systems, pp. 2017–2025. Cited by: §1.
 Back to basics: unsupervised learning of optical flow via brightness constancy and motion smoothness. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–10. Cited by: §1.
 Spherical CNNs on unstructured grids. In International Conference on Learning Representations (ICLR), Cited by: §1.
 Graphbased classification of omnidirectional images. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 869–878. Cited by: §5.
 Learning a probabilistic model for diffeomorphic registration. IEEE Transactions on Medical Imaging 38 (9), pp. 2165–2176. Cited by: §1.
 Robust nonrigid registration through agentbased action learning. In MICCAI, pp. 344–352. Cited by: §1.
 Ways toward an early diagnosis in Alzheimer’s disease: the alzheimer’s disease neuroimaging initiative (ADNI). Alzheimer’s & Dementia 1 (1), pp. 55–66. Cited by: §3.1.1.
 MSM: a new flexible framework for Multimodal Surface Matching. NeuroImage 100, pp. 414–426. Cited by: §1, §3.2, §5.
 Unet: convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computerassisted intervention, pp. 234–241. Cited by: §2.2, §2.3.
 Geometric convolutional neural network for analyzing surfacebased neuroimaging data. Frontiers in Neuroinformatics 12, pp. 42. Cited by: §1.
 Nonrigid image registration using multiscale 3D convolutional neural networks. In MICCAI, pp. 232–239. Cited by: §1.
 Deformable medical image registration: a survey. IEEE Transactions on Medical Imaging 32 (7), pp. 1153–1190. Cited by: §1.
 Learning spherical convolution for fast features from 360 imagery. In Advances in Neural Information Processing Systems (NIPS), pp. 529–539. Cited by: §1.

Equivariant transformer networks.
In
International Conference on Machine Learning (ICML)
, Cited by: §1.  Multicontrast multiscale surface registration for improved alignment of cortical areas. NeuroImage 111, pp. 107–122. Cited by: §1.
 Image matching as a diffusion process: an analogy with maxwell’s demons. Medical Image Analysis 2 (3), pp. 243–260. Cited by: §1.
 Functional density and edge maps: characterizing functional architecture in individuals and improving crosssubject registration. NeuroImage 158, pp. 346–355. Cited by: §5.
 Integration of letters and speech sounds in the human brain. Neuron 43 (2), pp. 271–282. Cited by: §1.
 The WUMinn human connectome project: an overview. NeuroImage 80, pp. 62–79. Cited by: §3.1.2.
 The functional magnetic resonance imaging data center (fMRIDC): the challenges and rewards of large–scale databasing of neuroimaging studies. Philosophical Transactions of the Royal Society of London B: Biological Sciences 356 (1412), pp. 1323–1339. Cited by: §3.1.2.
 Fast predictive image registration. In Deep Learning and Data Labeling for Medical Applications, pp. 48–57. Cited by: §1.
 Spherical demons: fast diffeomorphic landmarkfree surface registration. IEEE Transactions on Medical Imaging 29 (3), pp. 650–668. Cited by: §1, §1, §3.2.
 Frequency diffeomorphisms for efficient image registration. In IPMI, pp. 559–570. Cited by: §1.
Comments
There are no comments yet.