120.kmeans
SPEC ACCEL Benchmark Description File

Benchmark Name

120.kmeans


Benchmark Author

University of Virginia


Benchmark Program General Category

Dense Linear Algebra, Data Mining


Benchmark Description

K-means is a clustering algorithm used extensively in data mining and elsewhere, valued primarily for its simplicity. Many data-mining algorithms show a high degree of data parallelism. In k-means, each data object consists of several values, called features. By dividing a set of data objects into K sub-clusters, k-means represents all the data objects by the mean values, or centroids, of their respective sub-clusters. The initial center of each sub-cluster is either chosen randomly or derived from a heuristic. In each iteration, the algorithm associates each data object with its nearest center, according to a chosen distance metric. The new centroids are then calculated as the mean of all the data objects within each sub-cluster. The algorithm iterates until no data object moves from one sub-cluster to another.
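
The iteration described above can be sketched in a few lines of serial C++. This is only an illustration under stated assumptions, not the benchmark's source: the names Point, KMeansResult, squared_distance, and kmeans are hypothetical, the distance metric is assumed to be squared Euclidean distance, the initial centers are assumed to be the first K data objects (one possible heuristic), and the max_iterations cap is an added safeguard.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// One data object: a vector of feature values.
using Point = std::vector<double>;

// Hypothetical result type for this sketch.
struct KMeansResult {
    std::vector<Point> centroids;  // one center per sub-cluster
    std::vector<int> membership;   // sub-cluster index of each data object
    int iterations = 0;            // number of passes performed
};

// Assumed metric: squared Euclidean distance between two objects.
double squared_distance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Serial k-means. Assumption: the first k objects serve as initial centers.
KMeansResult kmeans(const std::vector<Point>& objects, std::size_t k,
                    int max_iterations = 500) {
    assert(k > 0 && k <= objects.size());
    const std::size_t dims = objects[0].size();

    KMeansResult result;
    result.centroids.assign(objects.begin(),
                            objects.begin() + static_cast<std::ptrdiff_t>(k));
    result.membership.assign(objects.size(), -1);

    bool moved = true;
    while (moved && result.iterations < max_iterations) {
        moved = false;
        ++result.iterations;

        // Assignment step: attach each object to its nearest center.
        for (std::size_t i = 0; i < objects.size(); ++i) {
            int best = 0;
            double best_dist = std::numeric_limits<double>::max();
            for (std::size_t c = 0; c < k; ++c) {
                const double dist = squared_distance(objects[i], result.centroids[c]);
                if (dist < best_dist) {
                    best_dist = dist;
                    best = static_cast<int>(c);
                }
            }
            if (result.membership[i] != best) {
                result.membership[i] = best;
                moved = true;  // at least one object changed sub-cluster
            }
        }

        // Update step: each center becomes the mean of its members.
        std::vector<Point> sums(k, Point(dims, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < objects.size(); ++i) {
            const std::size_t c = static_cast<std::size_t>(result.membership[i]);
            ++counts[c];
            for (std::size_t d = 0; d < dims; ++d) sums[c][d] += objects[i][d];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // keep the old center of an empty sub-cluster
            for (std::size_t d = 0; d < dims; ++d)
                result.centroids[c][d] = sums[c][d] / static_cast<double>(counts[c]);
        }
    }
    return result;
}
```

The assignment loop treats every data object independently, which is where the data parallelism mentioned above comes from; an accelerator version would typically map that loop onto device threads.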


Input Description

The input used for the test is a text file of features and attributes. Each line holds one feature together with its attributes.
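
A line-oriented text file of this kind could be read as sketched below. The exact file layout is not specified here, so this is a guess for illustration only: it assumes each line carries whitespace-separated numeric values and that every line has the same number of values. The real input may differ (for example, it might include an identifier column), and read_objects is a hypothetical name.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader: one record per line, values separated by whitespace.
std::vector<std::vector<double>> read_objects(std::istream& in) {
    std::vector<std::vector<double>> objects;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<double> values;
        double v;
        while (fields >> v) values.push_back(v);  // stops at the first non-numeric token
        if (values.empty()) continue;             // skip blank lines
        // Assumption: all lines carry the same number of values.
        assert(objects.empty() || values.size() == objects.front().size());
        objects.push_back(values);
    }
    return objects;
}
```

Taking a std::istream rather than a file name keeps the sketch easy to exercise with either a std::ifstream or an in-memory std::istringstream.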


Output Description

The program reports the cluster center coordinates.

The output file kmeans.out contains detailed timing information about the run. It also shows which device was selected and which devices were available to OpenCL. Status updates from the run are also included.


Programming Language

C++


Known portability issues

None


Reference

https://www.cs.virginia.edu/~skadron/wiki/rodinia/index.php/Main_Page

[1] J. Pisharath, Y. Liu, W. Liao, A. Choudhary, G. Memik, and J. Parhi. NU-MineBench 2.0. Technical Report CUCIS-2005-08-01, Department of Electrical and Computer Engineering, Northwestern University, Aug. 2005.

[2] S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, and K. Skadron. Rodinia: A Benchmark Suite for Heterogeneous Computing. In Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), pp. 44-54, Oct. 2009.

[3] S. Che, J. W. Sheaffer, M. Boyer, L. G. Szafaryn, L. Wang, and K. Skadron. A Characterization of the Rodinia Benchmark Suite with Comparison to Contemporary CMP Workloads. In Proceedings of the IEEE International Symposium on Workload Characterization, Dec. 2010.


Last Updated: February 03, 2014