Bibliography
[1]
N. Adam, and J. Wortman. Security-control methods for statistical databases. ACM Computing Surveys , 21(4), pp. 515–556, 1989.
[2]
G. Adomavicius, and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering , 17(6), pp. 734–749, 2005.
[3]
R. C. Agarwal, C. C. Aggarwal, and V. V. V. Prasad. A tree projection algorithm for generation of frequent item sets. Journal of parallel and Distributed Computing , 61(3), pp. 350–371, 2001. Also available as IBM Research Report, RC21341, 1999.
[4]
R. C. Agarwal, C. C. Aggarwal, and V. V. V. Prasad. Depth-first generation of long patterns. ACM KDD Conference , pp. 108–118, 2000. Also available as “Depth-first generation of large itemsets for association rules.” IBM Research Report , RC21538, 1999.
[5]
C. Aggarwal. Outlier analysis. Springer , 2013.
[6]
C. Aggarwal. Social network data analytics. Springer , 2011.
[7]
C. Aggarwal, and P. Yu. The igrid index: reversing the dimensionality curse for similarity indexing in high-dimensional space. KDD Conference , pp. 119–129, 2000.
[8]
C. Aggarwal, and P. Yu. On static and dynamic methods for condensation-based privacy-preserving data mining. ACM Transactions on Database Systems (TODS) , 33(1), 2, 2008.
[9]
C. Aggarwal. On unifying privacy and uncertain data models. IEEE International Conference on Data Engineering , pp. 386–395, 2008.
[10]
C. Aggarwal. On k -anonymity and the curse of dimensionality, Very Large Databases Conference , pp. 901–909, 2005.
[11]
C. Aggarwal. On randomization, public information and the curse of dimensionality. IEEE International Conference on Data Engineering , pp. 136–145, 2007.
[12]
C. Aggarwal. Privacy and the dimensionality curse. Privacy-Preserving Data Mining: Models and Algorithms , Springer, pp. 433–460, 2008.
[13]
C. Aggarwal, X. Kong, Q. Gu, J. Han, and P. Yu. Active learning: a survey. Data Classification: Algorithms and Applications , CRC Press, 2014.
[14]
C. Aggarwal. Instance-based learning: A survey. Data Classification: Algorithms and Applications , CRC Press, 2014.
[15]
C. Aggarwal. Redesigning distance-functions and distance-based applications for high-dimensional data. ACM SIGMOD Record , 30(1), pp. 13–18, 2001.
[16]
C. Aggarwal, and P. Yu. Mining associations with the collective strength approach. ACM PODS Conference , pp. 863–873, 1998.
[17]
C. Aggarwal, A. Hinneburg, and D. Keim. On the surprising behavior of distance-metrics in high-dimensional space. ICDT Conference , pp. 420–434, 2001.
[18]
C. Aggarwal. Managing and mining uncertain data. Springer , 2009.
[19]
C. Aggarwal, C. Procopiuc, J. Wolf, P. Yu, and J. Park. Fast algorithms for projected clustering. ACM SIGMOD Conference , pp. 61–72, 1999.
[20]
C. Aggarwal, J. Han, J. Wang, and P. Yu. On demand classification of data streams. ACM KDD Conference , pp. 503–508, 2004.
[21]
C. Aggarwal. On change diagnosis in evolving data streams. IEEE Transactions on Knowledge and Data Engineering , 17(5), pp. 587–600, 2005.
[22]
C. Aggarwal, and P. S. Yu. Finding generalized projected clusters in high dimensional spaces. ACM SIGMOD Conference , pp. 70–81, 2000.
[23]
C. Aggarwal, and S. Parthasarathy. Mining massively incomplete data sets by conceptual reconstruction. ACM KDD Conference , pp. 227–232, 2001.
[24]
C. Aggarwal. Outlier ensembles: position paper. ACM SIGKDD Explorations , 14(2), pp. 49–58, 2012.
[25]
C. Aggarwal. On the effects of dimensionality reduction on high dimensional similarity search. ACM PODS Conference , pp. 256–266, 2001.
[26]
C. Aggarwal, and H. Wang. Managing and mining graph data. Springer , 2010.
[27]
C. Aggarwal, C. Procopiuc, and P. Yu. Finding localized associations in market basket data. IEEE Transactions on Knowledge and Data Engineering , 14(1), pp. 51–62, 2002.
[28]
D. Agrawal, and C. Aggarwal. On the design and quantification of privacy-preserving data mining algorithms. ACM PODS Conference , pp. 247–255, 2001.
[29]
C. Aggarwal, and P. Yu. Privacy-preserving data mining: models and algorithms. Springer , 2008.
[30]
C. Aggarwal. Managing and mining sensor data. Springer , 2013.
[31]
C. Aggarwal, and C. Zhai. Mining text data. Springer , 2012.
[32]
C. Aggarwal, and C. Reddy. Data clustering: algorithms and applications, CRC Press , 2014.
[33]
C. Aggarwal. Data classification: algorithms and applications. CRC Press , 2014.
[34]
C. Aggarwal, and J. Han. Frequent pattern mining. Springer , 2014.
[35]
C. Aggarwal. On biased reservoir sampling in the presence of stream evolution. VLDB Conference , pp. 607–618, 2006.
[36]
C. Aggarwal. A framework for clustering massive-domain data streams. IEEE ICDE Conference , pp. 102–113, 2009.
[37]
C. Aggarwal, and P. Yu. Online generation of association rules. ICDE Conference , pp. 402–411, 1998.
[38]
C. Aggarwal, Z. Sun, and P. Yu. Online generation of profile association rules. ACM KDD Conference , pp. 129–133, 1998.
[39]
C. Aggarwal, J. Han, J. Wang, and P. Yu. A framework for clustering evolving data streams, VLDB Conference , pp. 81–92, 2003.
[40]
C. Aggarwal. Data streams: models and algorithms. Springer , 2007.
[41]
C. Aggarwal, J. Wolf, and P. Yu. A new method for similarity indexing of market basket data. ACM SIGMOD Conference , pp. 407–418, 1999.
[42]
C. Aggarwal, N. Ta, J. Wang, J. Feng, and M. Zaki. Xproj: A framework for projected structural clustering of XML documents. ACM KDD Conference , pp. 46–55, 2007.
[43]
C. Aggarwal. A human-computer interactive method for projected clustering. IEEE Transactions on Knowledge and Data Engineering , 16(4). pp. 448–460. 2004.
[44]
C. Aggarwal, and N. Li. On node classification in dynamic content-based networks. SDM Conference , pp. 355–366, 2011.
[45]
C. Aggarwal, A. Khan, and X. Yan. On flow authority discovery in social networks. SDM Conference , pp. 522–533, 2011.
[46]
C. Aggarwal, and P. Yu. Outlier detection for high dimensional data. ACM SIGMOD Conference , pp. 37–46, 2011.
[47]
C. Aggarwal, and P. Yu. On classification of high-cardinality data streams. SDM Conference , 2010.
[48]
C. Aggarwal, and P. Yu. On clustering massive text and categorical data streams. Knowledge and information systems , 24(2), pp. 171–196, 2010.
[49]
C. Aggarwal, Y. Xie, and P. Yu. On dynamic link inference in heterogeneous networks. SDM Conference , pp. 415–426, 2011.
[50]
C. Aggarwal, Y. Xie, and P. Yu. On dynamic data-driven selection of sensor streams. ACM KDD Conference , pp. 1226–1234, 2011.
[51]
C. Aggarwal. On effective classification of strings with wavelets. ACM KDD Conference , pp. 163–172, 2002.
[52]
C. Aggarwal. On abnormality detection in spuriously populated data streams. SDM Conference , pp. 80–91, 2005.
[53]
R. Agrawal, K.-I. Lin, H. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. VLDB Conference , pp. 490–501, 1995.
[54]
R. Agrawal, and J. Shafer. Parallel mining of association rules. IEEE Transactions on Knowledge and Data Engineering , 8(6), pp. 962–969, 1996. Also appears as IBM Research Report , RJ10004, January 1996.
[55]
R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. ACM SIGMOD Conference , pp. 207–216, 1993.
[56]
R. Agrawal, and R. Srikant. Fast algorithms for mining association rules. VLDB Conference , pp. 487–499, 1994.
[57]
R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. I. Verkamo. Fast discovery of association rules. Advances in knowledge discovery and data mining , 12, pp. 307–328, 1996.
[58]
R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining applications. ACM SIGMOD Conference , pp. 94–105, 1998.
[59]
R. Agrawal, and R. Srikant. Mining sequential patterns. IEEE International Conference on Data Engineering , pp. 3–14, 1995.
[60]
R. Agrawal, and R. Srikant. Privacy-preserving data mining. ACM SIGMOD Conference , pp. 439–450, 2000.
[61]
M. Agyemang, K. Barker, and R. Alhajj. A comprehensive survey of numeric and symbolic outlier mining techniques. Intelligent Data Analysis , 10(6). pp. 521–538, 2006.
[62]
R. Ahuja, T. Magnanti, and J. Orlin. Network flows: theory, algorithms, and applications. Prentice Hall , Englewood Cliffs, New Jersey, 1993.
[63]
M. Al Hasan, and M. J. Zaki. A survey of link prediction in social networks. Social network data analytics , Springer, pp. 243–275, 2011.
[64]
M. Al Hasan, V. Chaoji, S. Salem, and M. Zaki. Link prediction using supervised learning. SDM Workshop on Link Analysis, Counter-terrorism and Security , 2006.
[65]
S. Anand, and B. Mobasher. Intelligent techniques for web personalization. International conference on Intelligent Techniques for Web Personalization , pp. 1–36, 2003.
[66]
F. Angiulli, and C. Pizzuti. Fast Outlier detection in high dimensional spaces. European Conference on Principles of Knowledge Discovery and Data Mining , pp. 15–27, 2002.
[67]
F. Angiulli, and F. Fassetti. Detecting distance-based outliers in streams of data. ACM CIKM Conference , pp. 811–820, 2007.
[68]
L. Akoglu, H. Tong, J. Vreeken, and C. Faloutsos. Fast and reliable anomaly detection in categorical data. ACM CIKM Conference , pp. 415–424, 2012.
[69]
R. Albert, and A. L. Barabasi. Statistical mechanics of complex networks. Reviews of modern physics 74, 1, 47, 2002.
[70]
R. Albert, and A. L. Barabasi. Topology of evolving networks: local events and universality. Physical review letters 85, 24, pp. 5234–5237, 2000.
[71]
P. Allison. Missing data. Sage , 2001.
[72]
N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. ACM PODS Conference , pp. 20–29, 1996.
[73]
S. Altschul, T. Madden, A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. Lipman. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic acids research , 25(17), pp. 3389–3402, 1997.
[74]
M. R. Anderberg. Cluster Analysis for Applications. Academic Press , New York, 1973.
[75]
P. Andritsos, P. Tsaparas, R. J. Miller, and K. C. Sevcik. LIMBO: Scalable clustering of categorical data. EDBT Conference , pp. 123–146, 2004.
[76]
M. Ankerst, M. M. Breunig, H.-P. Kriegel, and J. Sander. OPTICS: ordering points to identify the clustering structure. ACM SIGMOD Conference , pp. 49–60, 1999.
[77]
A. Apostolico, and C. Guerra. The longest common subsequence problem revisited. Algorithmica , 2(1–4), pp. 315–336, 1987.
[78]
A. Azran. The rendezvous algorithm: Multiclass semi-supervised learning with markov random walks. International Conference on Machine Learning , pp. 49–56, 2007.
[79]
A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh. Clustering with Bregman divergences. Journal of Machine Learning Research , 6, pp. 1705–1749, 2005.
[80]
S. Basu, A. Banerjee, and R. J. Mooney. Semi-supervised clustering by seeding. ICML Conference , pp. 27–34, 2002.
[81]
S. Basu, M. Bilenko, and R. J. Mooney. A probabilistic framework for semi-supervised clustering. ACM KDD Conference , pp. 59–68, 2004.
[82]
R. J. Bayardo Jr. Efficiently mining long patterns from databases. ACM SIGMOD , pp. 85–93, 1998.
[83]
R. J. Bayardo, and R. Agrawal. Data privacy through optimal -anonymization. IEEE International Conference on Data Engineering , pp. 217–228, 2005.
[84]
R. Beckman, and R. Cook. Outliers. Technometrics , 25(2), pp. 119–149, 1983.
[85]
A. Ben-Hur, C. S. Ong, S. Sonnenburg, B. Scholkopf, and G. Ratsch. Support vector machines and kernels for computational biology. PLoS computational biology , 4(10), e1000173, 2008.
[86]
M. Benkert, J. Gudmundsson, F. Hubner, and T. Wolle. Reporting flock patterns. COMGEO , 2008
[87]
D. Berndt, and J. Clifford. Using dynamic time warping to find patterns in time series. KDD Workshop , 10(16), pp. 359–370, 1994.
[88]
K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. When is “nearest neighbor” meaningful? International Conference on Database Theory , pp. 217–235, 1999.
[89]
V. Barnett, and T. Lewis. Outliers in statistical data. Wiley, 1994.
[90]
M. Belkin, and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. NIPS , pp. 585–591, 2001.
[91]
M. Bezzi, S. De Capitani di Vimercati, S. Foresti, G. Livraga, P. Samarati, and R. Sassi. Modeling and preventing inferences from sensitive value distributions in data release. Journal of Computer Security , 20(4), pp. 393–436, 2012.
[92]
L. Bergroth, H. Hakonen, and T. Raita. A survey of longest common subsequence algorithms. String Processing and Information Retrieval , 2000.
[93]
S. Bhagat, G. Cormode, and S. Muthukrishnan. Node classification in social networks. Social Network Data Analytics , Springer, pp. 115–148. 2011.
[94]
M. Bilenko, S. Basu, and R. J. Mooney. Integrating constraints and metric learning in semi-supervised clustering. ICML Conference , 2004.
[95]
C. M. Bishop. Pattern recognition and machine learning. Springer , 2007.
[96]
C. M. Bishop. Neural networks for pattern recognition. Oxford University Press , 1995.
[97]
C. M. Bishop. Improving the generalization properties of radial basis function neural networks. Neural Computation , 3(4), pp. 579–588, 1991.
[98]
D. Blei, A. Ng, and M. Jordan. Latent dirichlet allocation. Journal of Machine Learning Research , 3: pp. 993–1022, 2003.
[99]
D. Blei. Probabilistic topic models. Communications of the ACM , 55(4), pp. 77–84, 2012.
[100]
A. Blum, and T. Mitchell. Combining labeled and unlabeled data with co-training. Proceedings of Conference on Computational Learning Theory , 1998.
[101]
A. Blum, and S. Chawla. Combining labeled and unlabeled data with graph mincuts. ICML Conference , 2001.
[102]
C. Bohm, K. Haegler, N. Muller, and C. Plant. Coco: coding cost for parameter free outlier detection. ACM KDD Conference , 2009.
[103]
K. Borgwardt, and H.-P. Kriegel. Shortest-path kernels on graphs. IEEE International Conference on Data Mining , 2005.
[104]
S. Boriah, V. Chandola, and V. Kumar. Similarity measures for categorical data: A comparative evaluation. SIAM Conference on Data Mining , 2008.
[105]
L. Bottou, and V. Vapnik. Local learning algorithms. Neural Computation , 4(6), pp. 888–900, 1992.
[106]
L. Bottou, C. Cortes, J. S. Denker, H. Drucker, I. Guyon, L. Jackel, Y. LeCun, U. A. Müller, E. Säckinger, P. Simard, and V. Vapnik. Comparison of classifier methods: a case study in handwriting digit recognition. International Conference on Pattern Recognition , pp. 77–87, 1994.
[107]
J. Boulicaut, A. Bykowski, and C. Rigotti. Approximation of frequency queries by means of free-sets. Principles of Data Mining and Knowledge Discovery , pp. 75–85, 2000.
[108]
P. Bradley, and U. Fayyad. Refining initial points for k -means clustering. ICML Conference , pp. 91–99, 1998.
[109]
M. Breunig, H.-P. Kriegel, R. Ng, and J. Sander. LOF: Identifying density-based local outliers. ACM SIGMOD Conference , 2000.
[110]
L. Breiman, J. Friedman, C. Stone, and R. Olshen. Classification and regression trees. CRC press , 1984.
[111]
L. Breiman. Random forests. Machine Learning , 45(1), pp. 5–32, 2001.
[112]
L. Breiman. Bagging predictors. Machine Learning , 24(2), pp. 123–140, 1996.
[113]
S. Brin, R. Motwani, and C. Silverstein. Beyond market baskets: generalizing association rules to correlations. ACM SIGMOD Conference , pp. 265–276, 1997.
[114]
S. Brin, and L. Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks , 30(1–7), pp. 107–117, 1998.
[115]
B. Bringmann, S. Nijssen, and A. Zimmermann. Pattern-based classification: A unifying perspective. arXiv preprint, arXiv:1111.6191 , 2011.
[116]
C. Brodley, and P. Utgoff. Multivariate decision trees. Machine learning , 19(1), pp. 45–77, 1995.
[117]
Y. Bu, L. Chen, A. W.-C. Fu, and D. Liu. Efficient anomaly monitoring over moving object trajectory streams. ACM KDD Conference , pp. 159–168, 2009.
[118]
M. Bulmer. Principles of Statistics. Dover Publications , 1979.
[119]
H. Bunke. On a relation between graph edit distance and maximum common subgraph. Pattern Recognition Letters, 18(8), pp. 689–694, 1997.
[120]
H. Bunke, and K. Shearer. A graph distance metric based on the maximal common subgraph. Pattern recognition letters , 19(3), pp. 255–259, 1998.
[121]
W. Buntine. Learning Classification Trees. Artificial intelligence frontiers in statistics . Chapman and Hall, pp. 182–201, 1993.
[122]
T. Burnaby. On a method for character weighting a similarity coefficient employing the concept of information. Mathematical Geology , 2(1), 25–38, 1970.
[123]
D. Burdick, M. Calimlim, and J. Gehrke. MAFIA: A maximal frequent itemset algorithm for transactional databases. IEEE International Conference on Data Engineering , pp. 443–452, 2001.
[124]
C. Burges. A tutorial on support vector machines for pattern recognition. Data mining and knowledge discovery , 2(2), pp. 121–167, 1998.
[125]
T. Calders, and B. Goethals. Mining all non-derivable frequent itemsets. Principles of Knowledge Discovery and Data Mining , pp. 74–86, 2002.
[126]
T. Calders, C. Rigotti, and J. F. Boulicaut. A survey on condensed representations for frequent sets. In Constraint-based mining and inductive databases , pp. 64–80, Springer, 2006.
[127]
S. Chakrabarti. Mining the Web: Discovering knowledge from hypertext data. Morgan Kaufmann , 2003.
[128]
S. Chakrabarti, B. Dom, and P. Indyk. Enhanced hypertext categorization using hyperlinks. ACM SIGMOD Conference , pp. 307–318, 1998.
[129]
S. Chakrabarti, S. Sarawagi, and B. Dom. Mining surprising patterns using temporal description length. VLDB Conference , pp. 606–617, 1998.
[130]
K. P. Chan, and A. W. C. Fu. Efficient time series matching by wavelets. IEEE International Conference on Data Engineering , pp. 126–133, 1999.
[131]
V. Chandola, A. Banerjee, and V. Kumar. Anomaly detection: A survey. ACM Computing Surveys , 41(3), 2009.
[132]
V. Chandola, A. Banerjee, and V. Kumar. Anomaly detection for discrete sequences: A survey. IEEE Transactions on Knowledge and Data Engineering , 24(5), pp. 823–839, 2012.
[133]
O. Chapelle. Training a support vector machine in the primal. Neural Computation , 19(5), pp. 1155–1178, 2007.
[134]
C. Chatfield. The analysis of time series: an introduction. CRC Press , 2003.
[135]
A. Chaturvedi, P. Green, and J. D. Carroll. k -modes clustering, Journal of Classification , 18(1), pp. 35–55, 2001.
[136]
N. V. Chawla, N. Japkowicz, and A. Kotcz. Editorial: Special issue on learning from imbalanced data sets. ACM SIGKDD Explorations Newsletter , 6(1), 1–6, 2004.
[137]
N. V. Chawla, K. W. Bower, L. O. Hall, and W. P. Kegelmeyer. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research (JAIR) , 16, pp. 321–356, 2002.
[138]
N. Chawla, A. Lazarevic, L. Hall, and K. Bowyer. SMOTEBoost: Improving prediction of the minority class in boosting. PKDD , pp. 107–119, 2003.
[139]
N. V. Chawla, D. A. Cieslak, L. O. Hall, and A. Joshi. Automatically countering imbalance and its empirical relationship to cost. Data Mining and Knowledge Discovery , 17(2), pp. 225–252, 2008.
[140]
K. Chen, and L. Liu. A survey of multiplicative perturbation for privacy-preserving data mining. Privacy-Preserving Data Mining: Models and Algorithms , Springer, pp. 157–181, 2008.
[141]
L. Chen, and R. Ng. On the marriage of L p -norms and the edit distance. VLDB Conference , pp. 792–803, 2004.
[142]
W. Chen, Y. Wang, and S. Yang. Efficient influence maximization in social networks. ACM KDD Conference , pp. 199–208, 2009.
[143]
W. Chen, C. Wang, and Y. Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. ACM KDD Conference , pp. 1029–1038, 2010.
[144]
W. Chen, Y. Yuan, and L. Zhang. Scalable influence maximization in social networks under the linear threshold model. IEEE International Conference on Data Mining , pp. 88–97, 2010.
[145]
D. Chen, C.-T. Lu, Y. Chen, and D. Kou. On detecting spatial outliers. Geoinformatica , 12: pp. 455–475, 2008.
[146]
T. Cheng, and Z. Li. A hybrid approach to detect spatialtemporal outliers. International Conference on Geoinformatics , pp. 173–178, 2004.
[147]
T. Cheng, and Z. Li. A multiscale approach for spatio-temporal outlier detection. Transactions in GIS, 10(2), pp. 253–263, March 2006.
[148]
Y. Cheng. Mean shift, mode seeking, and clustering. IEEE Transactions on PAMI , 17(8), pp. 790–799, 1995.
[149]
H. Cheng, X. Yan, J. Han, and C. Hsu. Discriminative frequent pattern analysis for effective classification. ICDE Conference , pp. 716–725, 2007.
[150]
F. Y. Chin, and G. Ozsoyoglu. Auditing and inference control in statistical databases. IEEE Transactions on Software Enginerring , 8(6), pp. 113–139, April 1982.
[151]
B. Chiu, E. Keogh, and S. Lonardi. Probabilistic discovery of time series motifs. ACM KDD Conference , pp. 493–498, 2003.
[152]
F. Chung. Spectral Graph Theory. Number 92 in CBMS Conference Series in Mathematics, American Mathematical Society , 1997.
[153]
V. Ciriani, S. De Capitani di Vimercati, S. Foresti, and P. Samarati. k -anonymous data mining: A survey. Privacy-preserving data mining: models and algorithms , Springer, pp. 105–136, 2008.
[154]
C. Clifton, M. Kantarcioglu, J. Vaidya, X. Lin, and M. Y. Zhu. Tools for privacy preserving distributed data mining. ACM SIGKDD Explorations Newsletter , 4(2), pp. 28–34, 2002.
[155]
N. Cristianini, and J. Shawe-Taylor. An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press , 2000.
[156]
W. Cochran. Sampling techniques. John Wiley and Sons , 2007.
[157]
D. Cohn, L. Atlas, and R. Ladner. Improving generalization with active learning. Machine Learning , 5(2), pp. 201–221, 1994.
[158]
D. Cohn, Z. Ghahramani, and M. Jordan. Active learning with statistical models. Journal of Artificial Intelligence Research , 4, pp. 129–145, 1996.
[159]
D. Comaniciu, and P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE Transactions on PAMI , 24(5), pp. 603–619, 2002.
[160]
D. Cook, and L. Holder. Graph-based data mining. IEEE Intelligent Systems , 15(2), pp. 32–41, 2000.
[161]
R. Cooley, B. Mobasher, and J. Srivastava. Data preparation for mining world wide web browsing patterns. Knowledge and information systems , 1(1), pp. 5–32, 1999.
[162]
L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub)graph isomorphism algorithm for matching large graphs. IEEE Transactions on Pattern Mining and Machine Intelligence , 26(10), pp. 1367–1372, 2004.
[163]
H. Shang, Y. Zhang, X. Lin, and J. X. Yu. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proceedings of the VLDB Endowment , 1(1), pp. 364–375, 2008.
[164]
J. R. Ullmann. An algorithm for subgraph isomorphism. Journal of the ACM , 23: pp. 31–42, January 1976.
[165]
G. Cormode, and S. Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms , 55(1), pp. 58–75, 2005.
[166]
S. Cost, and S. Salzberg. A weighted nearest neighbor algorithm for learning with symbolic features. Machine Learning , 10(1), pp. 57–78, 1993.
[167]
T. Cover, and P. Hart. Nearest neighbor pattern classification. IEEE Transactions on Information Theory , 13(1), pp. 21–27, 1967.
[168]
D. Cutting, D. Karger, J. Pedersen, and J. Tukey. Scatter/gather: A cluster-based approach to browsing large document collections. ACM SIGIR Conference , pp. 318–329, 1992.
[169]
M. Dash, K. Choi, P. Scheuermann, and H. Liu. Feature selection for clustering-a filter solution. ICDM Conference , pp. 115–122, 2002.
[170]
M. Deshpande, and G. Karypis. Item-based top- n recommendation algorithms. ACM Transactions on Information Systems (TOIS) , 22(1), pp. 143–177, 2004.
[171]
I. Dhillon. Co-clustering documents and words using bipartite spectral graph partitioning, ACM KDD Conference , pp. 269–274, 2001.
[172]
I. Dhillon, S. Mallela, and D. Modha. Information-theoretic co-clustering. ACM KDD Conference , pp. 89–98, 2003.
[173]
I. Dhillon, Y. Guan, and B. Kulis. Kernel -means: spectral clustering and normalized cuts. ACM KDD Conference , pp. 551–556, 2004.
[174]
P. Domingos. MetaCost: A general framework for making classifiers cost-sensitive. ACM KDD Conference , pp. 155–164, 1999.
[175]
P. Domingos. Bayesian averaging of classifiers and the overfitting problem. ICML Conference , pp. 223–230, 2000.
[176]
P. Domingos, and G. Hulten. Mining high-speed data streams. ACM KDD Conference , pp. 71–80. 2000.
[177]
P. Clark, and T. Niblett. The CN2 induction algorithm. Machine Learning , 3(4), pp. 261–283, 1989.
[178]
W. W. Cohen. Fast effectve rule induction. ICML Conference , pp. 115–123, 1995.
[179]
L. H. Cox. Suppression methodology and statistical disclosure control. Journal of the American Statistical Association , 75(370), pp. 377–385, 1980.
[180]
E. Cohen, M. Datar, S. Fujiwara, A. Gionis, P. Indyk, R. Motwani, and C. Yang. Finding interesting associations without support pruning. IEEE Transactions on Knowledge and Data Engineering , 13(1), pp. 64–78, 2001.
[181]
T. Dalenius, and S. Reiss. Data-swapping: A technique for disclosure control. Journal of statistical planning and inference , 6(1), pp. 73–85, 1982.
[182]
G. Das, and H. Mannila. Context-based similarity measures for categorical databases. PKDD Conference , pp. 201–210, 2000.
[183]
B. V. Dasarathy. Nearest neighbor (NN) norms: NN pattern classification techniques. IEEE Computer Society Press , 1990,
[184]
S. Deerwester, S. Dumais, T. Landauer, G. Furnas, and R. Harshman. Indexing by latent semantic analysis. JASIS , 41(6), pp. 391–407, 1990.
[185]
C. Ding, X. He, and H. Simon. On the equivalence of nonnegative matrix factorization and spectral clustering. SDM Conference , pp. 606–610, 2005.
[186]
J. Domingo-Ferrer, and J. M. Mateo-Sanz. Practical data-oriented microaggregation for statistical disclosure control. IEEE Transactions on Knowledge and Data Engineering , 14(1), pp. 189–201, 2002.
[187]
P. Domingos, and M. Pazzani. On the optimality of the simple bayesian classifier under zero-one loss. Machine Learning , 29(2–3), pp. 103–130, 1997.
[188]
W. Du, and M. Atallah. Secure multi-party computation: A review and open problems. CERIAS Tech. Report , 2001-51, Purdue University, 2001.
[189]
R. Duda, P. Hart, and D. Stork. Pattern classification. John Wiley and Sons , 2012.
[190]
C. Dwork. Differential privacy: A survey of results. Theory and Applications of Models of Computation , Springer, pp. 1–19, 2008.
[191]
C. Dwork. A firm foundation for private data analysis. Communications of the ACM , 54(1), pp. 86–95, 2011.
[192]
D. Easley, and J. Kleinberg. Networks, crowds, and markets: Reasoning about a highly connected world. Cambridge University Press , 2010.
[193]
C. Elkan. The foundations of cost-sensitive learning. IJCAI , pp. 973–978, 2001.
[194]
R. Elmasri, and S. Navathe. Fundamentals of Database Systems . Addison-Wesley, 2010.
[195]
L. Ertoz, M. Steinbach, and V. Kumar. A new shared nearest neighbor clustering algorithm and its applications. Workshop on Clustering High Dimensional Data and its Applications , pp. 105–115, 2002.
[196]
P. Erdos, and A. Renyi. On random graphs. Publicationes Mathematicae Debrecen , 6, pp. 290–297, 1959.
[197]
M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. ACM KDD Conference , pp. 226–231, 1996.
[198]
M. Ester, H. P. Kriegel, J. Sander, M. Wimmer, and X. Xu. Incremental clustering for mining in a data warehousing environment. VLDB Conference , pp. 323–333, 1998.
[199]
S. Even, O. Goldreich, and A. Lempel. A randomized protocol for signing contracts. Communications of the ACM , 28(6), pp. 637–647, 1985.
[200]
A. Evfimievski, R. Srikant, R. Agrawal, and J. Gehrke. Privacy preserving mining of association rules. Information Systems , 29(4), pp. 343–364, 2004.
[201]
M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. ACM SIGCOMM Computer Communication Review , pp. 251–262, 1999.
[202]
C. Faloutsos, and K. I. Lin. Fastmap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. ACM SIGMOD Conference , pp. 163–174, 1995.
[203]
W. Fan, S. Stolfo, J. Zhang, and P. Chan. AdaCost: Misclassification cost sensitive boosting. ICML Conference , pp. 97–105, 1999.
[204]
T. Fawcett. ROC Graphs: Notes and Practical Considerations for Researchers. Technical Report HPL-2003-4 , Palo Alto, CA, HP Laboratories, 2003.
[205]
X. Fern, and C. Brodley. Random projection for high dimensional data clustering: A cluster ensemble approach. ICML Conference , pp. 186–193, 2003.
[206]
C. Fiduccia, and R. Mattheyses. A linear-time heuristic for improving network partitions. In IEEE Conference on Design Automation , pp. 175–181, 1982.
[207]
R. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics , 7: pp. 179–188, 1936.
[208]
P. Flajolet, and G. N. Martin. Probabilistic counting algorithms for data base applications. Journal of Computer and System Sciences , 31(2), pp. 182–209, 1985.
[209]
G. W. Flake. Square unit augmented, radially extended, multilayer perceptrons. Neural Networks: Tricks of the Trade , pp. 145–163, 1998.
[210]
F. Fouss, A. Pirotte, J. Renders, and M. Saerens. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Transactions on Knowledge and Data Engineering , 19(3), pp. 355–369, 2007.
[211]
S. Forrest, C. Warrender, and B. Pearlmutter. Detecting intrusions using system calls: alternate data models. IEEE ISRSP , 1999.
[212]
S. Fortunato. Community Detection in Graphs. Physics Reports , 486(3–5), pp. 75–174, February 2010.
[213]
A. Frank, and A. Asuncion. UCI Machine Learning Repository, Irvine, CA: University of California, School of Information and Computer Science, 2010. http://​archive.​ics.​uci.​edu/​ml
[214]
E. Frank, M. Hall, and B. Pfahringer. Locally weighted naive bayes. Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence , pp, 249–256, 2002.
[215]
Y. Freund, and R. Schapire. A decision-theoretic generalization of online learning and application to boosting. Computational Learning Theory , pp. 23–37, 1995.
[216]
J. Friedman. Flexible nearest neighbor classification. Technical Report, Stanford University , 1994.
[217]
J. Friedman, R. Kohavi, and Y. Yun. Lazy decision trees. Proceedings of the National Conference on Artificial Intelligence , pp. 717–724, 1996.
[218]
B. Fung, K. Wang, R. Chen, and P. S. Yu. Privacy-preserving data publishing: A survey of recent developments. ACM Computing Surveys (CSUR) , 42(4), 2010.
[219]
G. Gan, C. Ma, and J. Wu. Data clustering: theory, algorithms, and applications. SIAM , 2007.
[220]
V. Ganti, J. Gehrke, and R. Ramakrishnan. CACTUS: Clustering categorical data using summaries. ACM KDD Conference , pp. 73–83, 1999.
[221]
M. Garey, and D. S. Johnson. Computers and intractability: A guide to the theory of NP-completeness. New York, Freeman , 1979.
[222]
H. Galhardas, D. Florescu, D. Shasha, and E. Simon. AJAX: an extensible data cleaning tool. ACM SIGMOD Conference 29(2), pp. 590, 2000.
[223]
J. Gao, and P.-N. Tan. Converting output scores from outlier detection algorithms into probability estimates. ICDM Conference , pp. 212–221, 2006.
[224]
M. Garofalakis, R. Rastogi, and K. Shim. SPIRIT: Sequential pattern mining with regular expression constraints. VLDB Conference , pp. 7–10, 1999.
[225]
T. Gartner, P. Flach, and S. Wrobel. On graph kernels: Hardness results and efficient alternatives. COLT: Kernel 2003 Workshop Proceedings , pp. 129–143, 2003.
[226]
Y. Ge, H. Xiong, Z.-H. Zhou, H. Ozdemir, J. Yu, and K. Lee. Top-Eye: Top- k evolving trajectory outlier detection. CIKM Conference , pp. 1733–1736, 2010.
[227]
J. Gehrke, V. Ganti, R. Ramakrishnan, and W.-Y. Loh. BOAT: Optimistic decision tree construction. ACM SIGMOD Conference , pp. 169–180, 1999.
[228]
J. Gehrke, R. Ramakrishnan, and V. Ganti. Rainforest-a framework for fast decision tree construction of large datasets. VLDB Conference , pp. 416–427, 1998.
[229]
D. Gibson, J. Kleinberg, and P. Raghavan. Clustering categorical data: an approach based on dynamical systems. The VLDB Journal , 8(3), pp. 222–236, 2000.
[230]
M. Girvan, and M. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences , 99(12), pp. 7821–7826.
[231]
S. Goil, H. Nagesh, and A. Choudhary. MAFIA: Efficient and scalable subspace clustering for very large data sets. ACM KDD Conference , pp. 443–452, 1999.
[232]
D. W. Goodall. A new similarity index based on probability. Biometrics , 22(4), pp. 882–907, 1966.
[233]
K. Gouda, and M. J. Zaki. Genmax: An efficient algorithm for mining maximal frequent itemsets. Data Mining and Knowledge Discovery , 11(3), pp. 223–242, 2005.
[234]
A. Goyal, F. Bonchi, and L. V. S. Lakshmanan. A data-based approach to social influence maximization. VLDB Conference , pp. 73–84, 2011.
[235]
A. Goyal, F. Bonchi, and L. V. S. Lakshmanan. Learning influence probabilities in social networks. ACM WSDM Conference , pp. 241–250, 2011.
[236]
R. Gozalbes, J. P. Doucet, and F. Derouin. Application of topological descriptors in QSAR and drug design: history and new trends. Current Drug Targets-Infectious Disorders , 2(1), pp. 93–102, 2002.
[237]
M. Gupta, J. Gao, C. Aggarwal, and J. Han. Outlier detection for temporal data. Morgan and Claypool, 2014.
[238]
S. Guha, R. Rastogi, and K. Shim. ROCK: A robust clustering algorithm for categorical attributes. Information Systems , 25(5), pp. 345–366, 2000.
[239]
S. Guha, R. Rastogi, and K. Shim. CURE: An efficient clustering algorithm for large databases. ACM SIGMOD Conference , pp. 73–84, 1998.
[240]
S. Guha, A. Meyerson, N. Mishra, R. Motwani, and L. O’Callaghan. Clustering data streams: Theory and practice. IEEE Transactions on Knowledge and Data Engineering , 15(3), pp. 515–528, 2003.
[241]
D. Gunopulos, and G. Das. Time series similarity measures and time series indexing. ACM SIGMOD Conference , pp, 624, 2001.
[242]
V. Guralnik, and G. Karypis. A scalable algorithm for clustering sequential data. IEEE International Conference on Data Engineering , pp. 179–186, 2001.
[243]
V. Guralnik, and G. Karypis. Parallel tree-projection-based sequence mining algorithms. Parallel Computing , 30(4): pp. 443–472, April 2004. Also appears in European Conference in Parallel Processing , 2001.
[244]
D. Gusfield. Algorithms on strings, trees and sequences. Cambridge University Press , 1997.
[245]
I. Guyon (Ed.). Feature extraction: foundations and applications. Springer , 2006.
[246]
I. Guyon, and A. Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research , 3, pp. 1157–1182, 2003.
[247]
M. Halkidi, Y. Batistakis, and M. Vazirgiannis. Cluster validity methods: part I. ACM SIGMOD record , 31(2), pp. 40–45, 2002.
[248]
M. Halkidi, Y. Batistakis, and M. Vazirgiannis. Clustering validity checking methods: part II. ACM SIGMOD Record , 31(3), pp. 19–27, 2002.
[249]
E. Han, and G. Karypis. Centroid-based document classification: analysis and experimental results. ECML Conference , pp. 424–431, 2000.
[250]
J. Han, M. Kamber, and J. Pei. Data mining: concepts and techniques. Morgan Kaufmann , 2011.
[251]
J. Han, G. Dong, and Y. Yin. Efficient mining of partial periodic patterns in time series database. International Conference on Data Engineering , pp. 106–115, 1999.
[252]
J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. ACM SIGMOD Conference , pp. 1–12, 2000.
[253]
J. Han, H. Cheng, D. Xin, and X. Yan. Frequent pattern mining: current status and future directions. Data Mining and Knowledge Discovery , 15(1), pp. 55–86, 2007.
[254]
J. Haslett, R. Brandley, P. Craig, A. Unwin, and G. Wills. Dynamic graphics for exploring spatial data with application to locating global and local anomalies. The American Statistician , 45: pp. 234–242, 1991.
[255]
T. Hastie, and R. Tibshirani. Discriminant adaptive nearest neighbor classification. IEEE Transactions on Pattern Analysis and Machine Intelligence , 18(6), pp. 607–616, 1996.
[256]
T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical learning. Springer , 2009.
[257]
V. Hautamaki, V. Karkkainen, and P. Franti. Outlier detection using -nearest neighbor graph. International Conference on Pattern Recognition , pp. 430–433, 2004.
[258]
T. H. Haveliwala. Topic-sensitive pagerank. World Wide Web Conference , pp. 517-526, 2002.
[259]
D. M. Hawkins. Identification of outliers. Chapman and Hall , 1980.
[260]
S. Haykin. Kalman filtering and neural networks. Wiley , 2001.
[261]
S. Haykin. Neural networks and learning machines. Prentice Hall , 2008.
[262]
X. He, D. Cai, and P. Niyogi. Laplacian score for feature selection. Advances in Neural Information Processing Systems , 18, 507, 2006.
[263]
Z. He, X. Xu, J. Huang, and S. Deng. FP-Outlier: Frequent pattern-based outlier detection. COMSIS , 2(1), pp. 103–118, 2005.
[264]
Z. He, X. Xu, and S. Deng. Discovering cluster-based local outliers, Pattern Recognition Letters , Vol 24(9–10), pp. 1641–1650, 2003.
[265]
M. Henrion, D. Hand, A. Gandy, and D. Mortlock. CASOS: A subspace method for anomaly detection in high-dimensional astronomical databases. Statistical Analysis and Data Mining , 2012. Online first: http://​onlinelibrary.​wiley.​com/​enhanced/​doi/​10.​1002/​sam.​11167/​
[266]
A. Hinneburg, C. Aggarwal, and D. Keim. What is the nearest neighbor in high-dimensional space? VLDB Conference , pp. 506–516, 2000.
[267]
A. Hinneburg, and D. Keim. An efficient approach to clustering in large multimedia databases with noise. ACM KDD Conference , pp. 58–65, 1998.
[268]
A. Hinneburg, D. A. Keim, and M. Wawryniuk. HD-Eye: Visual mining of high-dimensional data. Computer Graphics and Applications , 19(5), pp. 22–31, 1999.
[269]
A. Hinneburg, and H. Gabriel. DENCLUE 2.0: Fast clustering based on kernel-density estimation. Intelligent Data Analysis, Springer , pp. 70–80, 2007.
[270]
D. S. Hirschberg. Algorithms for the longest common subsequence problem. Journal of the ACM (JACM) , 24(4), pp. 664–675, 1975.
[271]
T. Hofmann. Probabilistic latent semantic indexing. ACM SIGIR Conference , pp. 50–57, 1999.
[272]
T. Hofmann. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems (TOIS) , 22(1), pp. 89–114, 2004.
[273]
M. Holsheimer, M. Kersten, H. Mannila, and H. Toivonen. A perspective on databases and data mining, ACM KDD Conference , pp. 150–155, 1995.
[274]
S. Hofmeyr, S. Forrest, and A. Somayaji. Intrusion detection using sequences of system calls. Journal of Computer Security , 6(3), pp. 151–180, 1998.
[275]
D. Hosmer Jr., S. Lemeshow, and R. Sturdivant. Applied logistic regression. Wiley , 2013.
[276]
J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraphs in the presence of isomorphism. IEEE ICDM Conference , pp. 549–552, 2003.
[277]
Z. Huang, X. Li, and H. Chen. Link prediction approach to collaborative filtering. ACM/IEEE-CS joint conference on Digital libraries , pp. 141–142, 2005.
[278]
Z. Huang, and M. Ng. A fuzzy k-modes algorithm for clustering categorical data. IEEE Transactions on Fuzzy Systems , 7(4), pp. 446–452, 1999.
[279]
G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. ACM KDD Conference , pp. 97–106, 2001.
[280]
J. W. Hunt, and T. G. Szymanski. A fast algorithm for computing longest common subsequences. Communications of the ACM , 20(5), pp. 350–353, 1977.
[281]
Y. S. Hwang, and S. Y. Bang. An efficient method to construct a radial basis function neural network classifier. Neural Networks , 10(8), pp. 1495–1503, 1997.
[282]
A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm on mining frequent substructures from graph data. Principles on Knowledge Discovery and Data Mining , pp. 13–23, 2000.
[283]
H. V. Jagadish, A. O. Mendelzon, and T. Milo. Similarity-based queries. ACM PODS Conference , pp. 36–45, 1995.
[284]
A. K. Jain, and R. C. Dubes. Algorithms for clustering data. Prentice-Hall, Inc. , 1998.
[285]
A. Jain, M. Murty, and P. Flynn. Data clustering: A review. ACM Computing Surveys (CSUR) , 31(3):264–323, 1999.
[286]
A. Jain, R. Duin, and J. Mao. Statistical pattern recognition: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence,, 22(1), pp. 4–37, 2000.
[287]
V. Janeja, and V. Atluri. Random walks to identify anomalous free-form spatial scan windows. IEEE Transactions on Knowledge and Data Engineering , 20(10), pp. 1378–1392, 2008.
[288]
J. Rennie, and N. Srebro. Fast maximum margin matrix factorization for collaborative prediction. ICML Conference , pp. 713–718, 2005.
[289]
G. Jeh, and J. Widom. SimRank: a measure of structural-context similarity. ACM KDD Conference , pp. 538–543, 2003.
[290]
H. Jeung, M. L. Yiu, X. Zhou, C. Jensen, and H. Shen. Discovery of convoys in trajectory databases. VLDB Conference , pp. 1068–1080, 2008.
[291]
T. Joachims. Making Large scale SVMs practical. Advances in Kernel Methods, Support Vector Learning , pp. 169–184, MIT Press , Cambridge, 1998.
[292]
T. Joachims. Training Linear SVMs in Linear Time. ACM KDD Conference , pp. 217–226, 2006.
[293]
T. Joachims. Transductive inference for text classification using support vector machines. International Conference on Machine Learning , pp. 200–209, 1999.
[294]
T. Joachims. Transductive learning via spectral graph partitioning. ICML Conference , pp. 290–297, 2003.
[295]
I. Jolliffe. Principal component analysis. John Wiley and Sons , 2005.
[296]
M. Joshi, V. Kumar, and R. Agarwal. Evaluating boosting algorithms to classify rare classes: comparison and improvements. IEEE ICDM Conference , pp. 257–264, 2001.
[297]
M. Kantarcioglu. A survey of privacy-preserving methods across horizontally partitioned data. Privacy-Preserving Data Mining: Models and Algorithms , Springer, pp. 313–335, 2008.
[298]
H. Kashima, K. Tsuda, and A. Inokuchi. Kernels for graphs. In Kernel Methods in Computational Biology , MIT Press, Cambridge, MA, 2004.
[299]
D. Karger, and C. Stein. A new approach to the minimum cut problem. Journal of the ACM (JACM) , 43(4), pp. 601–640, 1996.
[300]
G. Karypis, E. H. Han, and V. Kumar. Chameleon: Hierarchical clustering using dynamic modeling. Computer , 32(8), pp, 68–75, 1999.
[301]
G. Karypis, and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on scientific Computing , 20(1), pp. 359–392, 1998.
[302]
G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar. Multilevel hypergraph partitioning: applications in VLSI domain. IEEE Transactions on Very Large Scale Integration (VLSI) Systems , 7(1), pp. 69–79, 1999.
[303]
L. Kaufman, and P. J. Rousseeuw. Finding groups in data: an introduction to cluster analysis. Wiley , 2009.
[304]
D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. ACM KDD Conference , pp. 137–146, 2003.
[305]
E. Keogh, S. Lonardi, and C. Ratanamahatana. Towards parameter-free data mining. ACM KDD Conference , pp. 206–215, 2004.
[306]
E. Keogh, J. Lin, and A. Fu. HOT SAX: Finding the most unusual time series subsequence: Algorithms and applications. IEEE ICDM Conference , pp. 8, 2005.
[307]
E. Keogh, and M. Pazzani. Scaling up dynamic time-warping for data mining applications. ACM KDD Conference , pp. 285–289, 2000.
[308]
E. Keogh. Exact indexing of dynamic time warping. VLDB Conference , pp. 406–417, 2002.
[309]
E. Keogh, K. Chakrabarti, M. Pazzani, and S. Mehrotra. Dimensionality reduction for fast similarity searching in large time series datanases. Knowledge and Infomration Systems , pp. 263–286, 2000.
[310]
E. Keogh, S. Lonardi, and B. Y.-C. Chiu. Finding surprising patterns in a time series database in linear time and space. ACM KDD Conference , pp. 550–556, 2002.
[311]
E. Keogh, S. Lonardi, and C. Ratanamahatana. Towards parameter-free data mining. ACM KDD Conference , pp. 206–215, 2004.
[312]
B. Kernighan, and S. Lin. An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal , 1970.
[313]
A. Khan, N. Li, X. Yan, Z. Guan, S. Chakraborty, and S. Tao. Neighborhood-based fast graph search in large networks. ACM SIGMOD Conference, pp. 901–912, 2011.
[314]
A. Khan, Y. Wu, C. Aggarwal, and X. Yan. Nema: Fast graph matching with label similarity. Proceedings of the VLDB Endowment , 6(3), pp. 181–192, 2013.
[315]
D. Kifer, and J. Gehrke. Injecting utility into anonymized datasets. ACM SIGMOD Conference , pp. 217–228, 2006.
[316]
L. Kissner, and D. Song. Privacy-preserving set operations. Advances in Cryptology–CRYPTO , pp. 241–257, 2005.
[317]
J. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM) , 46(5), pp. 604–632, 1999.
[318]
S. Knerr, L. Personnaz, and G. Dreyfus. Single-layer learning revisited: a stepwise procedure for building and training a neural network. In J. Fogelman, editor, Neurocomputing: Algorithms, Architectures and Applications. Springer-Verlag, 1990.
[319]
E. Knorr, and R. Ng. Algorithms for mining distance-based outliers in large datasets. VLDB Conference , pp. 392–403, 1998.
[320]
E. Knorr, and R. Ng. Finding intensional knowledge of distance-based outliers. VLDB Conference , pp. 211–222, 1999.
[321]
Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer , 42(8), pp. 30–37, 2009.
[322]
Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. ACM KDD Conference , pp. 426–434, 2008.
[323]
Y. Koren. Collaborative filtering with temporal dynamics. Communications of the ACM,, 53(4), pp. 89–97, 2010.
[324]
D. Kostakos, G. Trajcevski, D. Gunopulos, and C. Aggarwal. Time series data clustering. Data Clustering: Algorithms and Applications , CRC Press, 2013.
[325]
J. Konstan. Introduction to recommender systems: algorithms and evaluation. ACM Transactions on Information Systems , 22(1), pp. 1–4, 2004.
[326]
Y. Kou, C. T. Lu, and D. Chen. Spatial weighted outlier detection, SIAM Conference on Data Mining , 2006.
[327]
A. Krogh, M. Brown, I. Mian, K. Sjolander, and D. Haussler. Hidden Markov models in computational biology: Applications to protein modeling. Journal of molecular biology , 235(5), pp. 1501–1531, 1994.
[328]
J. B. Kruskal. Nonmetric multidimensional scaling: a numerical method. Psychometrika , 29(2), pp. 115–129, 1964.
[329]
B. Kulis, S. Basu, I. Dhillon, and R. Mooney. Semi-supervised graph clustering: a kernel approach. Machine Learning , 74(1), pp. 1–22, 2009.
[330]
S. Kulkarni, G. Lugosi, and S. Venkatesh. Learning pattern classification: a survey. IEEE Transactions on Information Theory , 44(6), pp. 2178–2206, 1998.
[331]
M. Kuramochi, and G. Karypis. Frequent subgraph discovery. IEEE International Conference on Data Mining , pp. 313–320, 2001.
[332]
L. V. S. Lakshmanan, R. Ng, J. Han, and A. Pang. Optimization of constrained frequent set queries with 2-variable constraints. ACM SIGMOD Conference , pp. 157–168, 1999.
[333]
P. Langley, W. Iba, and K. Thompson. An analysis of Bayesian classifiers. Proceedings of the National Conference on Artificial Intelligence , pp. 223–228, 1992.
[334]
A. Lazarevic, and V. Kumar. Feature bagging for outlier detection. ACM KDD Conference , pp. 157–166, 2005.
[335]
K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Incognito: Efficient full-domain k-anonymity. ACM SIGMOD Conference , pp. 49–60, 2005.
[336]
K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Mondrian multidimensional k -anonymity. IEEE International Conference on Data Engineering , pp. 25, 2006.
[337]
J.-G. Lee, J. Han, and X. Li. Trajectory outlier detection: A partition-and-detect framework. ICDE Conference , pp. 140–149, 2008.
[338]
J.-G. Lee, J. Han, and K.-Y. Whang. Trajectory clustering: a partition-and-group framework. ACM SIGMOD Conference , pp. 593–604, 2007.
[339]
J.-G. Lee, J. Han, X. Li, and H. Gonzalez. TraClass: trajectory classification using hierarchical region-based and trajectory-based clustering. Proceedings of the VLDB Endowment , 1(1), pp. 1081–1094, 2008.
[340]
W. Lee, and D. Xiang. Information theoretic measures for anomaly detection. IEEE Symposium on Security and Privacy , pp. 130–143, 2001.
[341]
J. Leskovec, D. Huttenlocher, and J. Kleinberg. Predicting positive and negative links in online social networks. World Wide Web Conference , pp. 641–650, 2010.
[342]
J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters, and possible explanations. ACM KDD Conference , pp. 177–187, 2005.
[343]
J. Leskovec, A. Rajaraman, and J. Ullman. Mining of massive datasets. Cambridge University Press , 2012.
[344]
D. Lewis. Naive Bayes at forty: The independence assumption in information retrieval. ECML Conference , pp. 4–15, 1998.
[345]
D. Lewis, and J. Catlett. Heterogeneous uncertainty sampling for supervised learning. ICML Conference , pp. 148–156, 1994.
[346]
C. Li, Q. Yang, J. Wang, and M. Li. Efficient mining of gap-constrained subsequences and its various applications. ACM Transactions on Knowledge Discovery from Data (TKDD) , 6(1), 2, 2012.
[347]
J. Li, G. Dong, K. Ramamohanarao, and L. Wong. Deeps: A new instance-based lazy discovery and classification system. Machine Learning , 54(2), pp. 99–124, 2004.
[348]
N. Li, T. Li, and S. Venkatasubramanian. t-closeness: Privacy beyond -anonymity and -diversity. IEEE International Conference on Data Engineering , pp. 106–115, 2007.
[349]
W. Li, J. Han, and J. Pei. CMAR: Accurate and efficient classification based on multiple class-association rules. IEEE ICDM Conference , pp. 369–376, 2001.
[350]
Y. Li, M. Dong, and J. Hua. Localized feature selection for clustering. Pattern Recognition Letters , 29(1), 10–18, 2008.
[351]
Z. Li, B. Ding, J. Han, and R. Kays. Swarm: Mining relaxed temporal moving object clusters. Proceedings of the VLDB Endowment , 3(1–2), pp. 732–734, 2010.
[352]
Z. Li, B. Ding, J. Han, R. Kays, and P. Nye. Mining periodic behaviors for moving objects. ACM KDD Conference , pp. 1099–1108, 2010.
[353]
D. Liben-Nowell, and J. Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology , 58(7), pp. 1019–1031, 2007.
[354]
R. Lichtenwalter, J. Lussier, and N. Chawla. New perspectives and methods in link prediction. ACM KDD Conference , pp. 243–252, 2010.
[355]
J. Lin, E. Keogh, S. Lonardi, and B. Chiu. Experiencing SAX: a novel symbolic representation of time series. Data Mining and Knowledge Discovery , 15(2), pp. 107–144, 2003.
[356]
J. Lin, E. Keogh, S. Lonardi, and P. Patel. Finding motifs in time series. Proceedings of the 2nd Workshop on Temporal Data , 2002.
[357]
B. Liu. Web data mining: exploring hyperlinks, contents, and usage data. Springer, New York, 2007.
[358]
B. Liu, W. Hsu, and Y. Ma. Integrating classification and association rule mining. ACM KDD Conference , pp. 80–86, 1998.
[359]
G. Liu, H. Lu, W. Lou, and J. X. Yu. On computing, storing and querying frequent patterns. ACM KDD Conference , pp. 607–612, 2003.
[360]
H. Liu, and H. Motoda. Feature selection for knowledge discovery and data mining. Springer , 1998.
[361]
J. Liu, Y. Pan, K. Wang, and J. Han. Mining frequent item sets by opportunistic projection. ACM KDD Conference , pp. 229–238, 2002.
[362]
L. Liu, J. Tang, J. Han, M. Jiang, and S. Yang. Mining topic-level influence in heterogeneous networks. ACM CIKM Conference , pp. 199–208, 2010.
[363]
D. Lin. An Information-theoretic Definition of Similarity. ICML Conference , pp. 296–304, 1998.
[364]
R. Little, and D. Rubin. Statistical analysis with missing data. Wiley , 2002.
[365]
F. T. Liu, K. M. Ting, and Z.-H. Zhou. Isolation forest. IEEE ICDM Conference , pp. 413–422, 2008.
[366]
H. Liu, and H. Motoda. Computational methods of feature selection. Chapman and Hall/CRC , 2007.
[367]
K. Liu, C. Giannella, and H. Kargupta. A survey of attack techniques on privacy-preserving data perturbation methods. Privacy-Preserving Data Mining: Models and Algorithms , Springer, pp. 359–381, 2008.
[368]
B. London, and L. Getoor. Collective classification of network data. Data Classification: Algorithms and Applications , CRC Press, pp. 399–416, 2014.
[369]
C.-T. Lu, D. Chen, and Y. Kou. Algorithms for spatial outlier detection, IEEE ICDM Conference , pp. 597–600, 2003.
[370]
Q. Lu, and L. Getoor. Link-based classification. ICML Conference , pp. 496–503, 2003.
[371]
U. von Luxburg. A tutorial on spectral clustering. Statistics and computing , 17(4), pp. 395–416, 2007.
[372]
A. Machanavajjhala, D. Kifer, J. Gehrke, and M. Venkitasubramaniam. ℓ-diversity: privacy beyond k -anonymity. ACM Transactions on Knowledge Discovery from Data (TKDD) , 1(3), 2007.
[373]
S. Macskassy, and F. Provost. A simple relational classifier. Second Workshop on Multi-Relational Data Mining (MRDM) at ACM KDD Conference , 2003.
[374]
S. C. Madeira, and A. L. Oliveira. Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 1(1), pp. 24–45, 2004.
[375]
N. Mamoulis, H. Cao, G. Kollios, M. Hadjieleftheriou, Y. Tao, and D. Cheung. Mining, indexing, and querying historical spatiotemporal data. ACM KDD Conference , pp. 236–245, 2004.
[376]
G. Manku, and R. Motwani. Approximate frequency counts over data streams. VLDB Conference , pp. 346–357, 2002.
[377]
C. Manning, P. Raghavan, and H. Schutze. Introduction to information retrieval. Cambridge University Press , Cambridge, 2008.
[378]
M. Markou, and S. Singh. Novelty detection: a review, part 1: statistical approaches. Signal Processing , 83(12), pp. 2481–2497, 2003.
[379]
G. J. McLachian. Discriminant analysis and statistical pattern recognition. Wiler Interscience , 2004.
[380]
M. Markou, and S. Singh. Novelty detection: A review, part 2: neural network-based approaches. Signal Processing , 83(12), pp. 2481–2497, 2003.
[381]
M. Mehta, R. Agrawal, and J. Rissanen. SLIQ: A fast scalable classifier for data mining, EDBT Conference , pp. 18–32, 1996.
[382]
P. Melville, M. Saar-Tsechansky, F. Provost, and R. Mooney. An expected utility approach to active feature-value acquisition. IEEE ICDM Conference , 2005.
[383]
A. K. Menon, and C. Elkan. Link prediction via matrix factorization. Machine Learning and Knowledge Discovery in Databases , pp. 437–452, 2011.
[384]
B. Messmer, and H. Bunke. A new algorithm for error-tolerant subgraph isomprohism detection. IEEE Transactions on Pattern Mining and Machine Intelligence , 20(5), pp. 493–504, 1998.
[385]
A. Meyerson, and R. Williams. On the complexity of optimal k -anonymization. ACM PODS Conference , pp. 223–228, 2004.
[386]
R. Michalski, I. Mozetic, J. Hong, and N. Lavrac. The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. Proceedings of the AAAI , pp. 1–41, 1986.
[387]
C. Michael, and A. Ghosh. Two state-based approaches to program-based anomaly detection. Computer Security Applications Conference , pp. 21, 2000.
[388]
H. Miller, and J. Han. Geographic data mining and knowledge discovery. CRC Press , 2009.
[389]
T. M. Mitchell. Machine learning. McGraw Hill International Edition , 1997.
[390]
B. Mobasher. Web usage mining and personalization. Practical Handbook of Internet Computing, ed. Munindar Singh , pp, 264–265, CRC Press, 2005.
[391]
D. Montgomery, E. Peck, and G. Vining. Introduction to linear regression analysis. John Wiley and Sons , 2012.
[392]
C. H. Mooney, and J. F. Roddick. Sequential pattern mining: approaches and algorithms. ACM Computing Surveys (CSUR) , 45(2), 2013.
[393]
B. Moret. Decision trees and diagrams. ACM Computing Surveys (CSUR) , 14(4), pp. 593–623, 1982.
[394]
A. Mueen, E. Keogh, Q. Zhu, S. Cash, and M. Westover. Exact discovery of time series motifs. SDM Conference , pp. 473–484, 2009.
[395]
A. Mueen, and E. Keogh. Online discovery and maintenance of time series motifs. ACM KDD Conference , pp. 1089–1098, 2010.
[396]
E. Muller, M. Schiffer, and T. Seidl. Statistical selection of relevant subspace projections for outlier ranking. ICDE Conference , pp, 434–445, 2011.
[397]
E. Muller, I. Assent, P. Iglesias, Y. Mulle, and K. Bohm. Outlier analysis via subspace analysis in multiple views of the data. IEEE ICDM Conference , pp. 529–538, 2012.
[398]
S. K. Murthy. Automatic construction of decision trees from data: A multi-disciplinary survey. Data Mining and Knowledge Discovery , 2(4), pp. 345–389, 1998.
[399]
S. Nabar, K. Kenthapadi, N. Mishra, and R. Motwani. A survey of query auditing techniques for data privacy. Privacy-Preserving Data Mining: Models and Algorithms , Springer, pp. 415–431, 2008.
[400]
D. Nadeau, and S. Sekine. A survey of named entity recognition and classification. Lingvisticae Investigationes , 30(1), 3–26, 2007.
[401]
M. Naor, and B. Pinkas. Efficient oblivious transfer protocols. SODA Conference , pp. 448–457, 2001.
[402]
A. Narayanan, and V. Shmatikov. How to break anonymity of the netflix prize dataset. arXiv preprint cs/0610105 , 2006. http://​arxiv.​org/​abs/​cs/​0610105
[403]
G. Nemhauser, and L. Wolsey. Integer and combinatorial optimization. Wiley , New York, 1988.
[404]
J. Neville, and D. Jensen. Iterative classification in relational data. AAAI Workshop on Learning Statistical Models from Relational Data , pp. 13–20, 2000.
[405]
A. Ng, M. Jordan, and Y. Weiss. On spectral clustering analysis and an algorithm. Advances in Neural Information Processing Systems , pp. 849–856, 2001.
[406]
R. T. Ng, L. V. S. Lakshmanan, J. Han, and A. Pang. Exploratory mining and pruning optimizations of constrained associations rules. ACM SIGMOD Conference , pp. 13–24, 1998.
[407]
R. T. Ng, and J. Han. CLARANS: A method for clustering objects for spatial data mining. IEEE Transactions on Knowledge and Data Engineering , 14(5), pp. 1003–1016, 2002.
[408]
M. Neuhaus, and H. Bunke. Automatic learning of cost functions for graph edit distance. Information Sciences , 177(1), pp. 239–247, 2007.
[409]
M. Neuhaus, K. Riesen, and H. Bunke. Fast suboptimal algorithms for the computation of graph edit distance. Structural, Syntactic, and Statistical Pattern Recognition , pp. 163–172, 2006.
[410]
K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Text classification with labeled and unlabeled data using EM. Machine Learning , 39(2), pp. 103–134, 2000.
[411]
B. Ozden, S. Ramaswamy, and A. Silberschatz. Cyclic association rules. International Conference on Data Engineering , pp. 412–421, 1998.
[412]
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation engine: Bringing order to the web. Technical Report , 1999–0120, Computer Science Department, Stanford University, 1998.
[413]
F. Pan, G. Cong, A. Tung, J. Yang, and M. Zaki. CARPENTER: Finding closed patterns in long biological datasets. ACM KDD Conference , pp. 637–642, 2003.
[414]
T. Palpanas. Real-time data analytics in sensor networks. Managing and Mining Sensor Data , pp. 173–210, Springer, 2013.
[415]
F. Pan, A. K. H. Tung, G. Cong, and X. Xu. COBBLER: Combining column and row enumeration for closed pattern discovery. International Conference on Scientific and Statistical Database Management , pp. 21–30, 2004.
[416]
C. Papadimitriou, H. Tamaki, P. Raghavan, and S. Vempala. Latent semantic indexing: A probabilistic analysis. ACM PODS Conference , pp. 159–168, 1998.
[417]
N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. International Conference on Database Theory , pp. 398–416, 1999.
[418]
P. Patel, E. Keogh, J. Lin, and S. Lonardi. Mining motifs in massive time series databases. IEEE ICDM Conference , pp. 370–377, 2002.
[419]
J. Pei, J. Han, H. Lu, S. Nishio, S. Tang, and D. Yang. H-mine: Hyper-structure mining of frequent patterns in large databases. IEEE ICDM Conference , pp. 441–448, 2001.
[420]
J. Pei, J. Han, and R. Mao. CLOSET: An efficient algorithm for mining frequent closed itemsets. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery , pp, 21–30, 2000.
[421]
J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. Pinto, Q. Chen, U. Dayal, and M. C. Hsu. Mining sequential patterns by pattern-growth: The prefixspan approach. IEEE Transactions on Knowledge and Data Engineering , 16(11), pp. 1424–1440, 2004.
[422]
J. Pei, J. Han, and L. V. S. Lakshmanan. Mining frequent patterns with convertible constraints. ICDE Conference , pp. 433–442, 2001.
[423]
D. Pelleg, and A. W. Moore. X-means: Extending k -means with efficient estimation of the number of clusters. ICML Conference, pp. 727–734, 2000.
[424]
M. Petrou, and C. Petrou. Image processing: the fundamentals. Wiley , 2010.
[425]
D. Pierrakos, G. Paliouras, C. Papatheodorou, and C. Spyropoulos. Web usage mining as a tool for personalization: a survey. User Modeling and User-Adapted Interaction , 13(4), pp, 311–372, 2003.
[426]
D. Pokrajac, A. Lazerevic, and L. Latecki. Incremental local outlier detection for data streams. Computational Intelligence and Data Mining Conference , pp. 504–515, 2007.
[427]
S. A. Macskassy, and F. Provost. Classification in networked data: A toolkit and a univariate case study. Joirnal of Machine Learning Research , 8, pp. 935–983, 2007.
[428]
G. Qi, C. Aggarwal, and T. Huang. Link Prediction across networks by biased cross-network sampling. IEEE ICDE Conference , pp. 793–804, 2013.
[429]
G. Qi, C. Aggarwak, and T. Huang. Online community detection in social sensing. ACM WSDM Conference , pp. 617–626, 2013.
[430]
J. Quinlan. C4.5: programs for machine learning. Morgan-Kaufmann Publishers , 1993.
[431]
J. Quinlan. Induction of decision trees. Machine Learning , 1, pp. 81–106, 1986.
[432]
D. Rafiei, and A. Mendelzon. Similarity-based queries for time series data, ACM SIGMOD Record , 26(2), pp. 13–25, 1997.
[433]
E. Rahm, and H. Do. Data cleaning: problems and current approaches, IEEE Data Engineering Bulletin , 23(4), pp. 3–13, 2000.
[434]
R. Ramakrishnan, and J. Gehrke. Database Management Systems. Osborne/McGraw Hill , 1990.
[435]
V. Raman, and J. Hellerstein. Potter’s wheel: An interactive data cleaning system. VLDB Conference , pp. 381–390, 2001.
[436]
S. Ramaswamy, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. ACM SIGMOD Conference , pp. 427–438, 2000.
[437]
M. Rege, M. Dong, and F. Fotouhi. Co-clustering documents and words using bipartite isoperimetric graph partitioning. IEEE ICDM Conference , pp. 532–541, 2006.
[438]
E. S. Ristad, and P. N. Yianilos. Learning string-edit distance. IEEE Transactions on Pattern Analysis and Machine Intelligence . 20(5), pp. 522–532, 1998.
[439]
F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological review , 65(6), 286, 1958.
[440]
R. Salakhutdinov, and A. Mnih. Probabilistic Matrix Factorization. Advances in Neural and Information Processing Systems , pp. 1257–1264, 2007.
[441]
G. Salton, and M. J. McGill. Introduction to modern information retrieval. McGraw Hill , 1986.
[442]
P. Samarati. Protecting respondents identities in microdata release. IEEE Transactions on Knowledge and Data Engineering , 13(6), pp. 1010–1027, 2001.
[443]
H. Samet. The design and analysis of spatial data structures. Addison-Wesley , Reading, MA, 1990.
[444]
J. Sander, M. Ester, H. P. Kriegel, and X. Xu. Density-based clustering in spatial databases: The algorithm gdbscan and its applications. Data Mining and Knowledge Discovery , 2(2), pp. 169–194, 1998.
[445]
B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. World Wide Web Conference , pp. 285–295, 2001.
[446]
A. Savasere, E. Omiecinski, and S. B. Navathe. An efficient algorithm for mining association rules in large databases. Very Large Databases Conference , pp. 432–444, 1995.
[447]
A. Savasere, E. Omiecinski, and S. Navathe. Mining for strong negative associations in a large database of customer transactions. IEEE ICDE Conference , pp. 494–502, 1998.
[448]
C. Saunders, A. Gammerman, and V. Vovk. Ridge regression learning algorithm in dual variables. ICML Conference , pp. 515–521, 1998.
[449]
B. Scholkopf, and A. J. Smola. Learning with kernels: support vector machines, regularization, optimization, and beyond. Cambridge University Press , 2001.
[450]
B. Scholkopf, A. Smola, and K.-R. Muller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation , 10(5), pp. 1299–1319, 1998.
[451]
B. Scholkopf, and A. J. Smola. Learning with Kernels . MIT Press, Cambridge, MA, 2002.
[452]
H. Schutze, and C. Silverstein. Projections for efficient document clustering. ACM SIGIR Conference , pp. 74–81, 1997.
[453]
F. Sebastiani. Machine Learning in Automated Text Categorization. ACM Computing Surveys , 34(1), 2002.
[454]
B. Settles. Active Learning. Morgan and Claypool , 2012.
[455]
B. Settles, and M. Craven. An analysis of active learning strategies for sequence labeling tasks. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) , pp. 1069–1078, 2008.
[456]
D. Seung, and L. Lee. Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems , 13, pp. 556–562, 2001.
[457]
H. Seung, M. Opper, and H. Sompolinsky. Query by committee. Fifth annual workshop on Computational learning theory , pp. 287–294, 1992.
[458]
J. Shafer, R. Agrawal, and M. Mehta. SPRINT: A scalable parallel classifier for data mining. VLDB Conference , pp. 544–555, 1996.
[459]
S. Shekhar, C. T. Lu, and P. Zhang. Detecting graph-based spatial outliers: algorithms and applications. ACM KDD Conference , pp. 371–376, 2001.
[460]
S.Shekhar, C. T. Lu, and P. Zhang. A unified approach to detecting spatial outliers. Geoinformatica , 7(2), pp. 139–166, 2003.
[461]
S. Shekhar, and S. Chawla. A tour of spatial databases. Prentice Hall , 2002.
[462]
S. Shekhar, C. T. Lu, and P. Zhang. Detecting graph-based spatial outliers. Intelligent Data Analysis , 6, pp. 451–468, 2002.
[463]
S. Shekhar, and Y. Huang. Discovering spatial co-location patterns: a summary of results. In Advances in Spatial and Temporal Databases , pp. 236–256, Springer, 2001.
[464]
G. Sheikholeslami, S. Chatterjee, and A. Zhang. Wavecluster: A multi-resolution clustering approach for very large spatial databases. VLDB Conference , pp. 428–439, 1998.
[465]
P. Shenoy, J. Haritsa, S. Sudarshan, G., Bhalotia, M. Bawa, and D. Shah. Turbo-charging vertical mining of large databases. ACM SIGMOD Conference , 29(2), pp. 22–35, 2000.
[466]
J. Shi, and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 22(8), pp. 888–905, 2000.
[467]
R. Shumway, and D. Stoffer. Time-series analysis and its applications: With R examples, Springer , New York, 2011.
[468]
M.-L. Shyu, S.-C. Chen, K. Sarinnapakorn, and L. Chang. A novel anomaly detection scheme based on principal component classifier, ICDM Conference , pp. 353–365, 2003.
[469]
R. Sibson. SLINK: An optimally efficient algorithm for the single-link clustering method. The Computer Journal, 16(1), pp. 30–34, 1973.
[470]
A. Siebes, J. Vreeken, and M. van Leeuwen. itemsets that compress. SDM Conference , pp. 393–404, 2006.
[471]
B. W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman and Hall , 1986.
[472]
K. Smets, and J. Vreeken. The odd one out: Identifying and characterising anomalies. SIAM Conference on Data Mining , pp. 804–815, 2011.
[473]
E. S. Smirnov. On exact methods in systematics. Systematic Zoology , 17(1), pp. 1–13, 1968.
[474]
P. Smyth. Clustering sequences with hidden Markov models. Advances in Neural Information Processing Systems , pp. 648–654, 1997.
[475]
E. J. Stollnitz, and T. D. De Rose. Wavelets for computer graphics: theory and applications. Morgan Kaufmann , 1996.
[476]
R. Srikant, and R. Agrawal. Mining quantitative association rules in large relational tables. ACM SIGMOD Conference , pp. 1–12, 1996.
[477]
J. Srivastava, R. Cooley, M. Deshpande, and P. N. Tan. Web usage mining: Discovery and applications of usage patterns from web data. ACM SIGKDD Explorations Newsletter , 1(2), pp. 12–23, 2000.
[478]
I. Steinwart, and A. Christmann. Support vector machines. Springer , 2008.
[479]
A. Strehl, and J. Ghosh. Cluster ensembles—a knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research , 3, pp. 583–617, 2003.
[480]
G. Strang. An introduction to linear algebra. Wellesley Cambridge Press , 2009.
[481]
G. Strang, and K. Borre. Linear algebra, geodesy, and GPS. Wellesley Cambridge Press , 1997.
[482]
K. Subbian, C. Aggarwal, and J. Srivasatava. Content-centric flow mining for influence analysis in social streams. CIKM Conference , pp. 841–846, 2013.
[483]
J. Sun, and J. Tang. A survey of models and algorithms for social influence analysis. Social Network Data Analytics , Springer, pp. 177–214, 2011.
[484]
Y. Sun, J. Han, C. Aggarwal, and N. Chawla. When will it happen?: relationship prediction in heterogeneous information networks. ACM international conference on Web search and data mining , pp. 663–672, 2012.
[485]
P.-N Tan, M. Steinbach, and V. Kumar. Introduction to data mining. Addison-Wesley , 2005.
[486]
P. N. Tan, V. Kumar, and J. Srivastava. Selecting the right interestingness measure for association patterns. ACM KDD Conference , pp. 32–41, 2002.
[487]
J. Tang, Z. Chen, A. W.-C. Fu, and D. W. Cheung. Enhancing effectiveness of outlier detection for low density patterns. PAKDD Conference , pp. 535–548, 2002.
[488]
J. Tang, J. Sun, C. Wang, and Z. Yang. Social influence analysis in large-scale networks. ACM SIGKDD international conference on Knowledge discovery and data mining , pp. 807–816, 2009.
[489]
B. Taskar, M. Wong, P. Abbeel, and D. Koller. Link prediction in relational data. Advances in Neural Information Processing Systems , 2003.
[490]
J. Tenenbaum, V. De Silva, and J. Langford. A global geometric framework for nonlinear dimensionality reduction. Science , 290 (5500), pp. 2319–2323, 2000.
[491]
K. Ting, and I. Witten. Issues in stacked generalization. Journal of Artificial Intelligence Research , 10, pp. 271–289, 1999.
[492]
T. Mitsa. Temporal data mining. CRC Press , 2010.
[493]
H. Toivonen. Sampling large databases for association rules. VLDB Conference , pp. 134–145, 1996.
[494]
V. Vapnik. The nature of statistical learning theory. Springer , 2000.
[495]
J. Vaidya. A survey of privacy-preserving methods across vertically partitioned data. Privacy-Preserving Data Mining: Models and AlgorithmsM , Springer, pp. 337–358, 2008.
[496]
V. Vapnik. Statistical learning theory. Wiley , 1998.
[497]
V. Verykios, and A. Gkoulalas-Divanis. A Survey of Association Rule Hiding Methods for Privacy. Privacy-Preserving Data Mining: Models and Algorithms, Springer, pp. 267–289, 2008.
[498]
J. S. Vitter. Random sampling with a reservoir. ACM Transactions on Mathematical Software (TOMS) , 11(1), pp. 37–57, 2006.
[499]
M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, and E. Keogh. Indexing multi-dimensional time-series with support for multiple distance measures. ACM KDD Conference , pp. 216–225, 2003.
[500]
M. Vlachos, G. Kollios, and D. Gunopulos. Discovering similar multidimensional trajectories. IEEE International Conference on Data Engineering, pp. 673–684, 2002.
[501]
T. De Vries, S. Chawla, and M. Houle. Finding local anomalies in very high dimensional space. IEEE ICDM Conference , pp. 128–137, 2010.
[502]
A. Waddell, and R. Oldford. Interactive visual clustering of high dimensional data by exploring low-dimensional subspaces. INFOVIS , 2012.
[503]
H. Wang, W. Fan, P. Yu, and J. Han. Mining concept-drifting data streams using ensemble classifiers. ACM KDD Conference , pp. 226–235, 2003.
[504]
J. Wang, J. Han, and J. Pei. Closet+: Searching for the best strategies for mining frequent closed itemsets. ACM KDD Conference , pp. 236–245, 2003.
[505]
J. Wang, Y. Zhang, L. Zhou, G. Karypis, and C. C. Aggarwal. Discriminating subsequence discovery for sequence clustering. SIAM Conference on Data Mining , pp. 605–610, 2007.
[506]
W. Wang, J. Yang, and R. Muntz. STING: A statistical information grid approach to spatial data mining. VLDB Conference , pp. 186–195, 1997.
[507]
J. S. Walker. Fast fourier transforms. CRC Press , 1996.
[508]
S. Wasserman. Social network analysis: Methods and applications. Cambridge University Press , 1994.
[509]
D. Watts, and D. Strogatz. Collective dynamics of ‘small-world’ networks. Nature , 393 (6684), pp. 440–442, 1998.
[510]
L. Wei, E. Keogh, and X. Xi. SAXually Explicit images: Finding unusual shapes. IEEE ICDM Conference , pp. 711–720, 2006.
[511]
H. Wiener. Structural determination of paraffin boiling points. Journal of the American Chemical Society . 1(69). pp. 17–20, 1947.
[512]
L. Willenborg, and T. De Waal. Elements of statistical disclosure control. Springer , 2001.
[513]
D. Wolpert. Stacked generalization. Neural Networks , 5(2), pp. 241–259, 1992.
[514]
X. Xiao, and Y. Tao. Anatomy: Simple and effective privacy preservation. Very Large Databases Conference , pp. 139–150, 2006.
[515]
D. Xin, J. Han, X. Yan, and H. Cheng. Mining compressed frequent-pattern sets. VLDB Conference , pp. 709–720, 2005.
[516]
Z. Xing, J. Pei, and E. Keogh. A brief survey on sequence classification. SIGKDD Explorations Newsletter , 12(1), pp. 40–48, 2010.
[517]
H. Xiong, P. N. Tan, and V. Kumar. Mining strong affinity association patterns in data sets with skewed support distribution. ICDM Conference , pp. 387–394, 2003.
[518]
K. Yaminshi, J. Takeuchi, and G. Williams. Online unsupervised outlier detection using finite mixtures with discounted learning algorithms, ACM KDD Conference ,pp. 320–324, 2000.
[519]
X. Yan, and J. Han. gSpan: Graph-based substructure pattern mining. IEEE International Conference on Data Mining , pp. 721–724, 2002.
[520]
X. Yan, P. Yu, and J. Han. Substructure similarity search in graph databases. ACM SIGMOD Conference , pp. 766–777, 2005.
[521]
X. Yan, P. Yu, and J. Han. Graph indexing: a frequent structure-based approach. ACM SIGMOD Conference , pp. 335–346, 2004.
[522]
X. Yan, F. Zhu, J. Han, and P. S. Yu. Searching substructures with superimposed distance. International Conference on Data Engineering , pp. 88, 2006.
[523]
J. Yang, and W. Wang. CLUSEQ: efficient and effective sequence clustering. IEEE International Conference on Data Engineering , pp. 101–112, 2003.
[524]
D. Yankov, E. Keogh, J. Medina, B. Chiu, and V. Zordan. Detecting time series motifs under uniform scaling. ACM KDD Conference , pp. 844–853, 2007.
[525]
N. Ye. A markov chain model of temporal behavior for anomaly detection. IEEE Information Assurance Workshop , pp. 169, 2004.
[526]
B. K. Yi, H. V. Jagadish, and C. Faloutsos. Efficient retrieval of similar time sequences under time warping. IEEE International Conference on Data Engineering , pp. 201–208, 1998.
[527]
B. K. Yi, N. Sidiropoulos, T. Johnson, H. V. Jagadish, C. Faloutsos, and A. Biliris. Online data mining for co-evolving time sequences. International Conference on Data Engineering , pp. 13–22, 2000.
[528]
H. Yildirim, and M. Krishnamoorthy. A random walk method for alleviating the sparsity problem in collaborative filtering. ACM conference on Recommender systems , pp. 131–138, 2008.
[529]
X. Yin, and J. Han. CPAR: Classification based on predictive association rules. SIAM international conference on data mining , pp. 331–335, 2003.
[530]
S. Yu, and J. Shi. Multiclass spectral clustering. International Conference on Computer Vision , 2003.
[531]
B. Zadrozny, J. Langford, and N. Abe. Cost-sensitive learning by cost-proportionate example weighting. ICDM Conference , pp. 435–442, 2003.
[532]
R. Zafarani, M. A. Abbasi, and H. Liu. Social media mining: an introduction. Cambridge University Press , New York, 2014.
[533]
H. Zakerzadeh, C. Aggarwal, and K. Barker. Towards breaking the curse of dimensionality for high-dimensional privacy. SIAM Conference on Data Mining , pp. 731–739, 2014.
[534]
M. J. Zaki. Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering , 12(3), pp. 372–390, 2000.
[535]
M. J. Zaki. SPADE: An efficient algorithm for mining frequent sequences. Machine learning , 42(1–2), pp. 31–60, 2001. 31–60.
[536]
M. J. Zaki, and M. Wagner Jr. Data mining and analysis: fundamental concepts and algorithms. Cambridge University Press , 2014.
[537]
M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. New algorithms for fast discovery of association rules. KDD Conference , pp. 283–286, 1997.
[538]
M. J. Zaki, and K. Gouda. Fast vertical mining using diffsets. ACM KDD Conference , pp. 326–335, 2003.
[539]
M. J. Zaki, and C. Hsiao. CHARM: An efficient algorithm for closed itemset mining. SIAM Conference on Data Mining , pp. 457–473, 2002.
[540]
M. J. Zaki, and C. Aggarwal. XRules: An effective algorithm for structural classification of XML data. Machine Learning , 62(1–2), pp. 137–170, 2006.
[541]
B. Zenko. Is combining classifiers better than selecting the best one? Machine Learning , pp. 255–273, 2004.
[542]
Y. Zhai, and B. Liu. Web data extraction based on partial tree alignment. World Wide Web Conference , pp. 76–85, 2005.
[543]
D. Zhan, M. Li, Y. Li, and Z.-H. Zhou. Learning instance specific distances using metric propagation. ICML Conference , pp. 1225–1232, 2009.
[544]
H. Zhang, A. Berg, M. Maire, and J. Malik. SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. Computer Vision and Pattern Recognition , pp. 2126–2136, 2006.
[545]
J. Zhang, Z. Ghahramani, and Y. Yang. A probabilistic model for online document clustering with application to novelty detection. Advances in Neural Information Processing Systems , pp. 1617–1624, 2004.
[546]
J. Zhang, Q. Gao, and H. Wang. SPOT: A system for detecting projected outliers from high-dimensional data stream. ICDE Conference , 2008.
[547]
D. Zhang, and G. Lu. Review of shape representation and description techniques. Pattern Recognition , 37(1), pp. 1–19, 2004.
[548]
S. Zhang, W. Wang, J. Ford, and F. Makedon. Learning from incomplete ratings using nonnegative matrix factorization. SIAM Conference on Data Mining , pp. 549–553, 2006.
[549]
T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Conference , pp. 103–114, 1996.
[550]
Z. Zhao, and H. Liu. Spectral feature selection for supervised and unsupervised learning. ICML Conference , pp. 1151–1157, 2007.
[551]
D. Zhou, O. Bousquet, T. Lal, J. Weston, and B. Scholkopf. Learning with local and global consistency. Advances in Neural Information Processing Systems , 16(16), pp. 321–328, 2004.
[552]
D. Zhou, J. Huang, and B. Scholkopf. Learning from labeled and unlabeled data on a directed graph. ICML Conference , pp. 1036–1043, 2005.
[553]
F. Zhu, X. Yan, J. Han, P. S. Yu, and H. Cheng. Mining colossal frequent patterns by core pattern fusion. ICDE Conference , pp. 706–715, 2007.
[554]
X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using gaussian fields and harmonic functions. ICML Conference , pp. 912–919, 2003.
[555]
X. Zhu, and A. Goldberg. Introduction to semi-supervised learning. Morgan and Claypool , 2009.
Index
A
χ 2 Measure
_ -diversity
k -anonymity
t -closeness
AdaBoost
Agglomerative Clustering
Aggregate Change Points
Almost Closed Sets
AMS Sketch
Approximate Frequent Patterns
Apriori Algorithm
AR Model
ARIMA Model
ARMA Model
Association Pattern Mining
Association Rule Hiding
Association Rules
Associative Classifiers
Authorities
Autoregressive Integrated Moving Average
Model
Autoregressive Model
Autoregressive Moving Average Model
AVC-set
B
Bag-of-Words Kernel
Bagging
Balaban Index
Barabasi-Albert Model
Baum-Welch Algorithm
Bayes Classifier
Bayes Optimal Privacy
Bayes Reconstruction Method
Bayes Text Classifier
Behavioral Attributes
Bernoulli Bayes Model
Between-Class Scatter Matrix
Betweenness Centrality
Bias Term in SVMs
Biased Sampling
Big Data
Binarization
Binning of Time Series
Biological Sequences
BIRCH
Bisecting K-Means
Bloom Filter
BOAT
Boosting
Bootstrap
Bootstrapped Aggregating
Bucket of Models
Buckshot
C
C4.5rules
Candidate Distribution Algorithm
Cascade
Categorical Data Clustering
CBA
Centrality
Centroid Distance Signature
Centroid-based Text Classification
Chebychev Inequality
Chernoff Bound (Lower-Tail)
Chernoff Bound (Upper-Tail)
Circuit Rank
CLARA
CLARANS
Classification
Classification Based on Associations
Classification of Time Series
Classifier Evaluation
Classifying Graphs
Cleaning Data
CLIQUE
Closed Itemsets
Closed Patterns
Closeness Centrality
CLUSEQ
Cluster Digest for Text
Cluster Validation
Clustering
Clustering Coefficient
Clustering Data Streams
Clustering Graphs
Clustering Tendency
Clustering Text
Clustering Time Series
Clusters and Outliers
CluStream
Co-clustering
Co-clustering for Recommendations
Co-location Patterns
Co-Training
Coefficient of Determination
Collaborative Filtering
Collective Classification
Combination Outliers in Sequences
Community Detection
Compression-based Dissimilarity Measure,
513
Concept Drift
Condensation-based Anonymization
Confidence
Confidence Monotonicity
Constrained Clustering
Constrained Pattern Mining
Constrained Sequential Patterns
Content-based Recommendations
Contextual Attributes
CONTOUR
Coordinate Descent
Core of Joined Subgraphs
Count-Min Sketch
Cross-Validation
CSketch
CURE
CVFDT
Cyclomatic Number
D
Data Classification
Data Cleaning
Data Clustering
Data Reduction
Data Streams
Data Type Portability
Data Types, 6
Data-centered Ensembles
DBSCAN
Decision List
Decision Trees
Degree Centrality
Degree Prestige
DENCLUE
Dendrogram
Densification
Density Attractors
DepthProject Algorithm
Differencing Time Series
Diffusion Models
Dijkstra Algorithm
Dimensionality Curse in Privacy
Dimensionality Reduction
Discrete Cosine Transform
Discrete Fourier Transform
Discrete Sequence Similarity Measures
Discretization
Discriminative Classifier
Distance-based Clustering
Distance-based Entropy
Distance-based Motifs
Distance-based Outlier Detection
Distance-based Sequence Clustering
Distance-based Sequence Outliers
Distributed Privacy
Document Preparation
Document-Term Matrix, 8
Domain Generalization Hierarchy
Downward Closure Property
DWT
Dynamic Programming in HMM
Dynamic Time Warping Distance
Dynamics of Network Formation
E
Early Termination Trick
Earth Mover Distance
Eckart-Young Theorem
Eclat
Edit Distance
Edit Distance in Graphs
Eigenvector Centrality
EM Algorithm for Continuous Data
EM Algorithm for Data Clustering
Embedded Models
Energy of a Data Set
Ensemble Classification
Ensemble Clustering
Ensemble-based Streaming Classification
Entropy
Entropy _ -diversity
Enumeration Tree
Equivalence Class in Privacy
Error Tree of Wavelet Representation
Estrada Index
Euclidean Metric
Event Detection
Evolutionary Outlier Algorithms
Example Re-weighting
Expected Error Reduction
Expected Model Change
Expected Variance Reduction
Explaining Sequence Anomalies
Exponential Smoothing
Extreme Value Analysis
F
Feature Bagging
Feature Selection
Feature Selection for Classification
Feature Selection for Clustering
Filter Models
Finite State Automaton
First Story Detection
Fisher Score
Fisher’s Linear Discriminant
Flajolet-Martin Algorithm
FOIL’s Information Gain
Forward Algorithm
Forward-backward Algorithm
Fowlkes-Mallows Measure
Fractionation
Frequency-based Sequence Outliers
Frequent Itemset
Frequent Pattern Mining
Frequent Pattern Mining in Streams
Frequent Substructure Mining
Frequent Trajectory Paths
Frequent Traversal Patterns
Full-Domain Generalization
G
Generalization in Privacy
Generalization Property
Generalized Linear Models
Generative Classifier
Geodesic Distances
Gini Index
Girvan-Newman Algorithm
GLM
Global Recoding
Global Statistical Similarity
Goodall Measure
Graph Classification
Graph Clustering
Graph Database
Graph Distances and Matching
Graph Edit Distance
Graph Isomorphism
Graph Kernels
Graph Matching
Graph Similarity Measures
Graph-based Algorithms
Graph-based Collaborative Filtering
Graph-based Methods
Graph-based Semisupervised Learning
Graph-based Sequence Clustering
Graph-based Spatial Neighborhood
Graph-based Spatial Outliers
Graph-based Time-Series Clustering
Gregariousness in Social Networks
Grid-based Outliers
Grid-based Projected Outliers
GSP Algorithm
H
Haar Wavelets
Heavy Hitters
Hidden Markov Model Clustering
Hidden Markov Models
Hierarchical Clustering Algorithms
High Dimensional Privacy
Hinge Loss
Histogram-based Outliers
HITS
HMETIS
HMM
HMM Applications
Hoeffding Inequality
Hoeffding Trees
Holdout
Homophily
Hopkin’s Statistic
Hosoya Index
HOTSAX
Hubs
Hybrid Feature Selection
I
Imputation
Incognito
Incognito Super-roots
Inconsistent Data
Independent Cascade Model
Independent Ensembles
Inductive Classifiers
Influence Analysis
Information Gain
Information Theoretic Measures
Instance-based Learning
Instance-based Text Classification
Interest Ratio
Internal Validation Criteria
Intrinsic Dimensionality
Inverse Document Frequency
Inverse Occurrence Frequency
Inverted Index
ISOMAP
Item-based Recommendations
Itemset
Iterative Classification Algorithm
J
Jaccard Coefficient
Jaccard for Multiway Similarity
K
K-Means
K-Medians
K-Medoids
K-Modes
Katz Centrality
Kernel Density Estimation
Kernel Fisher’s Discriminant
Kernel K-Means
Kernel Logistic Regression
Kernel PCA
Kernel Ridge Regression
Kernel SVM
Kernel Trick
Kernels in Graphs
Kernighan-Lin Algorithm
Keyword-based Sequence Similarity
Kruskal Stress
L
Label Propagation Algorithm
Lagrangian Optimization in NMF
Large Itemset
Lasso
Latent Components of NMF
Latent Components of SVD
Latent Factor Models
Latent Semantic Indexing
Law Enforcement
Lazy Learners
Learn-One-Rule
Leave-One-Out Bootstrap
Leave-One-Out Cross-Validation
Left Eigenvector
Level-wise Algorithms
Levenshtein Distance
Lexicographic Tree
Likelihood Ratio Statistic
Linear Discriminant Analysis
Linear Threshold Model
Link Prediction
Link Prediction for Recommendations
Loadshedding
Local Outlier Factor
Local Recoding
LOF
Logistic Regression
Longest Common Subsequence
Lookahead-based Pruning
Lossy Counting Algorithm
LSA
M
MA Model
Macro-clustering
Mahalanobis k -means
Mahalanobis Distance
Manhattan Metric
Margin
Margin Constraints
Markov Inequality
Massive-Domain Stream Clustering
Massive-Domain Streaming Classification
Match-based Distance Measures in Graphs
Maximal Frequent Itemsets
Maximum Common Subgraph
Maximum Common Subgraph Problem
Mean-Shift Clustering
Mercer Kernel Map
Mercer’s Theorem
METIS
Metric
Micro-clustering
Min-Max Scaling
Minkowski Distance
Missing Data
Missing Time-Series Values
Mixture Modeling
Model Selection
Model-centered Ensembles
Mondrian Algorithm
Moore-Penrose Pseudoinverse
Morgan Index
Motif Discovery
Moving Average Model
Moving Average Smoothing
Multiclass Learning
Multidimensional Change Points
Multidimensional Scaling
Multidimensional Spatial Neighborhood
Multidimensional Spatial Outliers
Multilayer Neural Network
Multinomial Bayes Model
Multivariate Extreme Values
Multivariate Time Series
Multivariate Time-Series Forecasting
Multiview Clustering
N
Naive Bayes Classifier
NCSA Common Log Format
Near Duplicate Detection
Nearest Neighbor Classifier
Neighborhood-based Collaborative Filtering
Network Data
Neural Networks
NMF
Node-Induced Subgraph
Noise Removal from Time Series
Non-stationary Time Series
Nonlinear Regression
Nonlinear Support Vector Machines
Nonnegative Matrix Factorization
Normalization
Normalization of Time Series
Normalized Wavelet Basis
Novelties in Text
O
Oblivious Transfer Protocol
One-Against-One Multiclass Learning
One-Against-Rest Multiclass Learning
Online Novelty Detection
Online Time-Series Clustering
ORCLUS
Ordered Probit Regression
Outlier Analysis
Outlier Detection
Outlier Ensembles
Outlier Validity
Output Privacy
Overfitting
P
PAA
PageRank
Partial Periodic Patterns
Partition Algorithm
Partition-1
PCA
Perceptron
Periodic Patterns
Perturbation for Privacy
Pessimistic Error Rate
Piecewise Aggregate Approximation
PLSA
Point Outliers in Time Series
Poisson Regression
Polynomial Regression
Pool-based Active Learning
Position Outliers in Sequences
Power-Iteration Method
Power-Law Degree Distribution
Predictive Attribute Dependence
Preferential Attachment
Preferential Crawlers
Prestige
Principal Component Analysis
Principal Components Regression
Privacy-Preserving Data Mining
Privacy-Preserving Data Publishing
Probabilistic Classifiers
Probabilistic Clustering
Probabilistic Latent Semantic Analysis
Probabilistic Outlier Detection
Probabilistic Suffix Trees
Probabilistic Text Clustering
Probit Regression
PROCLUS
Product Graph
Profile Association Rules
Projected Outliers
Projection-based Reuse
Projection-based Reuse of Support Counting,
107
Proximal Gradient Methods
Proximity Models for Mixed Data
Proximity Prestige
PST
Pyramidal Time Frame
Q
Query Auditing
Query-by-Committee
Querying Patterns
QuickSI Algorithm
R
RainForest
Randic Index
Random Forests
Random Subspace Ensemble
Random Subspace Sampling
Random Walks
Random-Walk Kernels
Randomization for Privacy
Rank Prestige
Ranking Algorithms
Rare Class Learning
Ratings Matrix
Recommendations
Recommender Systems
Recursive ( c, _ )-diversity
Regression Modeling
Regularization
Regularization in Collective Classification
Rendezvous Label Propagation
Representative-based Clustering
Representativeness-based Active Learning
Reservoir Sampling
Response Variable
Ridge Regression
Right Eigenvector
RIPPER
Rocchio Classification
ROCK
S
Samarati’s Algorithm
Sampling
SAX
Scalable Classification
Scalable Clustering
Scalable Decision Trees
Scale-Free Networks
Scaling
Scatter Gather Text Clustering
Secure Multi-party Computation
Secure Set Union Protocol
Selective Sampling
Self Training
Semisupervised Bayes Classification
Semisupervised Clustering
Semisupervised Learning
Sensor-Selection
Sequence Classification
Sequence Data
Sequence Outlier Detection
Sequential Covering Algorithms
Sequential Ensembles
Sequential Pattern Mining
Shape Analysis
Shape Clustering
Shape Outliers
Shape-based Time-Series Clustering
Shared Nearest Neighbors
Shingling
Short Memory Property
Shortest Path Kernels
Shrinking Diameters
Signature Table
Similarity Computation with Mixed Data
Simple Matching Coefficient
Simple Redundancy
SimRank
Singular Value Decomposition
Small World Networks
SMOTE
Social Influence Analysis
Soft SVM
Spatial Co-location Patterns
Spatial Data
Spatial Data Mining
Spatial Outliers
Spatial Tile Transformation
Spatial Wavelets
Spatiotemporal Data
Spectral Clustering
Spectral Decomposition
Spectral Methods in Collective Classification,
646
Spectrum Kernel
Spider Traps
Spiders
SPIRIT
Stacking
Standardization
Stationary Time Series
Stop-word Removal
STORM
Stratified Cross-Validation
Stratified Sampling
STREAM Algorithm
Streaming Classification
Streaming Data
Streaming Frequent Pattern Mining
Streaming Novelty Detection
Streaming Outlier Detection
Streaming Privacy
Streaming Synopsis
Strict Redundancy
String Data
Subgraph Isomorphism
Subgraph Matching
Subsequence
Subsequence-based Clustering
Superset-based Pruning
Supervised Feature Selection
Supervised Micro-clusters for Classification
Support
Support Vector Machines
Support Vectors
Suppression in Privacy
SVD
SVM for Text
SVMLight
SVMPerf
Symbolic Aggregate Approximation
Symmetric Confidence Measure
Synopsis for Streams
Synthetic Data for Anonymization
Synthetic Over-sampling
System Diagnosis
T
Tag Trees
TARZAN
Temporal Similarity Measures
Term Strength
Text Classification
Text Clustering
Text SVM
Tikhonov Regularization
Time Series Similarity Measures
Time Warping
Time-Series Classification
Time-Series Correlation Clustering
Time-Series Data, 9
Time-Series Data Mining
Time-Series Forecasting
Time-Series Preparation
Topic Modeling
Topic-Sensitive PageRank
Topological Descriptors
Trajectory Classification
Trajectory Clustering
Trajectory Mining
Trajectory Outlier Detection
Trajectory Pattern Mining
Transductive Classifiers
Transductive Support Vector Machines
TreeProjection Algorithm
Triadic Closure
U
Ullman’s Isomorphism Algorithm
Uncertainty Sampling
Universal Crawlers
Unsupervised Feature Selection
User-based Recommendations
Utility in Privacy
Utility Matrix
V
Value Generalization Hierarchy
Velocity Density Estimation
Vertical Counting Methods
VF2 Algorithm
Viterbi Algorithm
W
Ward’s Method
Wavelet-based Rules
Wavelets
Web Crawling
Web Document Processing
Web Resource Discovery
Web Server Logs
Web Usage Mining
Weighted Degree Kernel
Wiener Index
Within-Class Scatter Matrix
Wrapper Models
X
XProj
XRules
Z
Z-Index