Cluster tendency assessment in neuronal spike data

Authors: Sara Mahallati aff001; James C. Bezdek aff004; Milos R. Popovic aff001; Taufik A. Valiante aff001
Author affiliations: Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, Canada aff001; KITE Research Institute, University Health Network, Toronto, Canada aff002; Krembil Research Institute, University Health Network, Toronto, Canada aff003; Computer Science and Information Systems Departments, University of Melbourne, Melbourne, Australia aff004; Division of Neurosurgery, University of Toronto, Toronto, Canada aff005; CRANIA, University Health Network and University of Toronto, Toronto, Canada aff006
Published in: PLoS ONE 14(11)
Category: Research Article
doi: 10.1371/journal.pone.0224547


Sorting spikes from extracellular recordings into clusters associated with distinct single units (putative neurons) is a fundamental step in analyzing neuronal populations. Spike sorting is intrinsically unsupervised, because the number of neurons is not known a priori. It therefore requires one of two approaches: specifying a fixed value c for the number of clusters to seek, or generating candidate partitions for several possible values of c and then selecting the best candidate using post-clustering validation criteria. In this paper, we investigate the first approach and evaluate how well several methods provide lower-dimensional visualizations of the cluster structure and how they affect subsequent spike clustering. We also introduce a visualization technique called improved visual assessment of cluster tendency (iVAT) to estimate possible cluster structures in data without the need for dimensionality reduction. Experiments are conducted on two datasets with ground truth labels. In data with a relatively small number of clusters, iVAT is beneficial in estimating the number of clusters to inform the initialization of clustering algorithms. With larger numbers of clusters, iVAT gives a useful estimate of the coarse cluster structure but sometimes fails to indicate the presumptive number of clusters. We show that noise associated with recording extracellular neuronal potentials can disrupt computational clustering schemes, highlighting the benefit of probabilistic clustering models. Our results show that t-Distributed Stochastic Neighbor Embedding (t-SNE) provides representations of the data that yield more accurate visualization of potential cluster structure to inform the clustering stage. Moreover, the clusters obtained using t-SNE features were more reliable than the clusters obtained using the other methods, which indicates that t-SNE can potentially be used both for visualization and for extracting features to be used by any clustering algorithm.
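
To make the iVAT idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the usual two-step description of the method: a VAT reordering of the dissimilarity matrix by a Prim-style nearest-neighbor sweep, followed by the recursive iVAT update that replaces each dissimilarity with the largest edge on the minimax path between the two objects. The function names `vat_order` and `ivat` and the toy one-dimensional data are hypothetical and chosen only for illustration.

```python
def vat_order(D):
    """Return a VAT ordering of the objects in a square dissimilarity matrix D."""
    n = len(D)
    # Start from an object that takes part in the largest dissimilarity.
    start = max(range(n), key=lambda i: max(D[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    # best[q] = smallest dissimilarity from q to any already ordered object.
    best = {q: D[start][q] for q in remaining}
    while remaining:
        # Prim-style step: append the unordered object closest to the ordered set
        # (ties are broken by the smaller index).
        nxt = min(remaining, key=lambda q: (best[q], q))
        order.append(nxt)
        remaining.remove(nxt)
        for q in remaining:
            best[q] = min(best[q], D[nxt][q])
    return order


def ivat(D):
    """Return (VAT order, iVAT-transformed matrix in that order)."""
    order = vat_order(D)
    n = len(order)
    R = [[D[i][j] for j in order] for i in order]  # VAT-reordered matrix
    T = [[0] * n for _ in range(n)]
    for r in range(1, n):
        # Nearest previously ordered object to object r.
        j = min(range(r), key=lambda k: R[r][k])
        T[r][j] = R[r][j]
        for c in range(r):
            if c != j:
                # Minimax path distance: largest edge on the path r -> j -> c.
                T[r][c] = max(R[r][j], T[j][c])
        for c in range(r):
            T[c][r] = T[r][c]  # keep the matrix symmetric
    return order, T


# Toy example (assumed data): two well-separated groups on a line.
points = [100, 0, 101, 1, 2, 102]
D = [[abs(a - b) for b in points] for a in points]
order, T = ivat(D)
print(order)  # [1, 3, 4, 0, 2, 5]
for row in T:
    print(row)
```

Displayed as a grayscale image, the transformed matrix for this toy input would show two dark 3 x 3 blocks on the diagonal (entries of 1) against a light background (entries of 98), which is the kind of visual cue used to suggest a candidate number of clusters. Real spike waveforms would first need a pairwise dissimilarity matrix, for example Euclidean distances between waveforms, and this pure-Python version is only practical for small inputs.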

Keywords:

Action potentials – Algorithms – Clustering algorithms – Data visualization – k-means clustering – Neurons – Principal component analysis – Vision


Article published in journal issue: 2019, No. 11