Unlike simple imputation methods such as mean, regression, and hot deck imputation, which can cause bias and loss of precision (Fang et al., 2009; Little & Rubin, 2002), the s-FCM algorithm accounts for the uncertainty introduced by imputation. (Because of the arbitrary missing-data patterns, five imputed datasets [Rubin, 1996] were generated using the Markov chain Monte Carlo method with multiple chains, a noninformative Jeffreys prior, and 500 burn-in iterations [Schafer, 1997]. For each imputed dataset, we then minimized the fuzzy objective function [i.e., the intracluster variance; Bezdek, 1981; Bezdek, Keller, Krishnapuram, & Pal, 2005; Fang et al., 2011].)

Given a termination clustering number (C_T) of 22 [C_T = (N/2)^(1/2), where N is the sample size; Bezdek et al., 2005], the s-FCM algorithm searched for the optimal number of latent classes through a comprehensive validation procedure, including (a) exposure inconsistency rate (the ratio of the number of mothers who have inconsistent labels to the total sample size [the larger the value, the less stable the algorithm]); (b) accuracy rate (the average rate calculated from misclassified smokers and nonsmokers across imputed datasets, where smoker and nonsmoker status is known but subgroups within smokers are not; compared with typical clustering techniques such as hierarchical and K-means clustering, s-FCM was demonstrated to have the best accuracy and consistency rates across imputed datasets [Fang et al., 2011]); (c) validation indices (to identify the optimal number of latent classes [groups with like patterns of smoking behavior across pregnancy], s-FCM calculates a set of fuzzy cluster validation indices based on multiple imputed datasets. These validation indices were modified and computed as the average scores across imputed datasets. The Xie-Beni index [XBmi], widely used for fuzzy clustering, quantifies the ratio of the total variation within latent classes to the separation between them, with smaller being better [Xie & Beni, 1991]. The other two indices used were the partition coefficient [PCmi] with decreasing monotonicity [the smaller the better] and the partition entropy [PEmi] with increasing monotonicity [the larger the better; Bezdek et al., 2005; Fang et al., 2011]); (d) graphs (Sammon [1969] mapping was incorporated into the algorithm to visualize the latent classes in two-dimensional space from multidimensional data, while the functional curves for repeatedly measured smoking variables reflect the intensity of variation over time for each latent exposure class); and (e) statistical testing (the differences between exposure variables [used to characterize smoking during pregnancy] were then examined among the identified latent classes to provide quantitative information on tobacco exposure across latent classes over time; Fang et al., 2011).
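The validation indices in (c) and the termination clustering number can be illustrated with a short sketch. The Python below is not the s-FCM implementation of Fang et al. (2011). It uses the textbook definitions of the partition coefficient, partition entropy, and Xie-Beni index, and it assumes that the "mi" versions are plain averages of each index over the imputed datasets, as the text describes; the modified indices in the original work may differ. All function names are hypothetical, rounding C_T down to an integer is an assumption, and the toy data are made up for illustration only.

```python
import math

def termination_cluster_number(n_samples):
    """Upper limit on the number of clusters: C_T = (N/2)^(1/2).
    Rounding down to an integer is an assumption of this sketch."""
    return int(math.sqrt(n_samples / 2))

def partition_coefficient(U):
    """Textbook PC = (1/N) * sum_i sum_k u_ik^2 for memberships U[i][k]."""
    return sum(u * u for row in U for u in row) / len(U)

def partition_entropy(U):
    """Textbook PE = -(1/N) * sum_i sum_k u_ik * ln(u_ik), with 0*ln(0) = 0."""
    return -sum(u * math.log(u) for row in U for u in row if u > 0) / len(U)

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def xie_beni(X, U, V, m=2.0):
    """Xie-Beni index: membership-weighted within-cluster variation divided by
    N times the smallest squared distance between two cluster centers."""
    n, c = len(X), len(V)
    compactness = sum(U[i][k] ** m * sq_dist(X[i], V[k])
                      for i in range(n) for k in range(c))
    separation = min(sq_dist(V[j], V[k])
                     for j in range(c) for k in range(j + 1, c))
    return compactness / (n * separation)

def average_over_imputations(index_fn, fits):
    """Average an index over per-imputation fits; each fit is a tuple of
    arguments for index_fn (assumed meaning of the 'mi' suffix)."""
    return sum(index_fn(*fit) for fit in fits) / len(fits)

# Toy example: two "imputed" versions of a 4-point, 1-D dataset, 2 clusters.
X1 = [[0.0], [1.0], [9.0], [10.0]]
X2 = [[0.0], [2.0], [9.0], [10.0]]
V1 = [[0.5], [9.5]]
V2 = [[1.0], [9.5]]
U1 = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
U2 = [[0.8, 0.2], [0.8, 0.2], [0.1, 0.9], [0.1, 0.9]]

xb_mi = average_over_imputations(xie_beni, [(X1, U1, V1), (X2, U2, V2)])
pc_mi = average_over_imputations(partition_coefficient, [(U1,), (U2,)])
pe_mi = average_over_imputations(partition_entropy, [(U1,), (U2,)])
print("XB_mi:", xb_mi, "PC_mi:", pc_mi, "PE_mi:", pe_mi)
```

In a full analysis these averages would be computed for every candidate number of classes from 2 up to C_T, with the membership matrices and centers coming from a fuzzy clustering fit on each imputed dataset, and the resulting index values would then be compared across candidates.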
