Binned Scatter



Binned scatters can take a little getting used to, but they are great for aggregating and cleaning up data so that groups can be compared. They work particularly well on noisy data, and we find them exceptionally useful for minerals processing data.


This is article 6/7 in our series focussing on the bread and butter visualisations our analytics team use. To receive a copy of the whole series now go to,


Visualisations are critical when exploring data and very useful in evaluation; the insight gained from them informs the whole analytical workflow. Some types of visualisation lend themselves better to one use than the other, and deciding which to use in each application is a learning process. At Interlate, our experience with analytics in minerals processing plants has given us an appreciation for how to make these decisions, and we will outline some of our insights in this series.



Good For: 



  • Identifying trends clearly across the data: A summary statistic (such as the mean or median) overlaid on each bin with confidence intervals makes any patterns easy to see. It also clears up much of the noise that is often present in high-density data. T-tests can then be used to quantify the differences in bin averages.
  • Overlaying with Box and Whisker plots in order to visualise how the distribution of the data changes across the range of the variable. F-tests can then be used to quantify differences in spread (variance) between bins.
  • Determining which ranges of a process variable give the best results against a KPI.
  • Once this has been done, a T-test can be used to check whether the difference in the KPI between the selected ranges is statistically significant.
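As a rough illustration of the first point above, the sketch below groups (x, y) pairs into equal-width bins of x and reports the mean of y in each bin with an approximate 95% confidence interval. The function name `binned_summary`, the equal-width binning and the normal approximation (z = 1.96) are all assumptions made for this example; they are not how Clarofy necessarily does it, and the dosage and recovery numbers are made up purely to show the call.

```python
import math
import statistics


def binned_summary(x, y, n_bins=5, z=1.96):
    """Summarise y within equal-width bins of x (hypothetical helper)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins
    if width == 0:
        raise ValueError("x has no spread, so it cannot be binned")

    groups = [[] for _ in range(n_bins)]
    for xi, yi in zip(x, y):
        # The largest x would index one past the end, so clamp it into the last bin.
        idx = min(int((xi - lo) / width), n_bins - 1)
        groups[idx].append(yi)

    rows = []
    for i, ys in enumerate(groups):
        if len(ys) < 2:
            continue  # a confidence interval needs at least two points
        mean = statistics.fmean(ys)
        # Normal approximation; for small bins a t critical value would be wider.
        half_width = z * statistics.stdev(ys) / math.sqrt(len(ys))
        rows.append({
            "bin_start": lo + i * width,
            "bin_end": lo + (i + 1) * width,
            "n": len(ys),
            "mean": mean,
            "ci_low": mean - half_width,
            "ci_high": mean + half_width,
        })
    return rows


if __name__ == "__main__":
    # Made-up illustrative values: reagent dosage (g/t) and gold recovery (%).
    dosage = [5.1, 5.8, 6.4, 6.9, 7.5, 8.2, 8.8, 9.1, 9.7, 10.3]
    recovery = [86.0, 87.5, 86.8, 88.1, 87.9, 90.2, 91.0, 90.5, 91.4, 91.1]
    for row in binned_summary(dosage, recovery, n_bins=2):
        print(row)
```

Plotting the `mean` of each bin against the bin centre, with `ci_low` and `ci_high` as error bars, gives the basic binned scatter; the `n` column is worth keeping alongside it, since a bin with very few points can look more certain than it is.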



We can see that a total reagent dosage greater than 8 g/t has historically given a higher gold recovery.


The relationship with particle size is less clear, but we can see that increasing particle size is detrimental to recovery. With more data, an optimum particle size range could be selected.


  • Looking for multi-variable interactions, to discover how ranges interact across different process variables. Make sure ‘cherry-picking’ of certain modes or time periods isn’t occurring. (A Setpoint Selector feature is coming to Clarofy soon to help with that.)
  • Finding out which control loops need tuning: A great way to do this is to plot a variable’s setpoint (an SP or SV tag) against the corresponding error (the difference between the actual value (PV) and the setpoint (SV)). This lets you see whether the error changes across different SV ranges.
  • Revealing non-linear maxima (if present) in the data, provided a suitable bin size is used. These can be trickier to spot using scatter plots alone, and they are a good reason to try different bin sizes before confirming any hypotheses.
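The control-loop check described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the function name `error_by_setpoint` is hypothetical, the error is taken as the absolute difference |PV - SV|, and bins are fixed-width ranges of the setpoint. The tag values are invented for the example.

```python
import math
import statistics


def error_by_setpoint(sv, pv, bin_width):
    """Return {bin_start: (point_count, mean_abs_error)} per setpoint range."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")

    errors = {}
    for s, p in zip(sv, pv):
        # Snap each setpoint down to the start of its bin.
        start = math.floor(s / bin_width) * bin_width
        errors.setdefault(start, []).append(abs(p - s))

    return {
        start: (len(errs), statistics.fmean(errs))
        for start, errs in sorted(errors.items())
    }


if __name__ == "__main__":
    # Made-up illustrative setpoints (SV) and measured values (PV).
    sv = [10, 12, 14, 20, 22, 24]
    pv = [10.5, 11.5, 14.5, 23.0, 19.0, 27.0]
    for start, (count, mean_err) in error_by_setpoint(sv, pv, 10).items():
        print(f"SV {start} to {start + 10}: n={count}, mean |PV - SV| = {mean_err:.2f}")
```

A mean error that grows in one setpoint range but not in others suggests the loop is tuned well for only part of its operating range. Re-running with a few different `bin_width` values is a cheap way to check that the pattern is not an artefact of the bin size.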



Difficult With: 



  • If your data has low density (perhaps one sample per day or shift), trends may be difficult to uncover with binned scatter plots.
  • Binned scatters do not show how much time or how many data points fall within each range of the process variable.
  • Different categories or modes of operation cannot be distinguished on a single binned scatter; view these on separate binned scatters.
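One way to work around the last two limitations is to compute the bin summaries separately for each mode of operation and keep the point count next to each mean. The sketch below assumes a hypothetical helper `binned_means_by_mode`, fixed-width bins, and a per-sample mode label; the values are invented for illustration.

```python
import math
import statistics
from collections import defaultdict


def binned_means_by_mode(x, y, mode, bin_width):
    """Return {(mode, bin_start): (point_count, mean_y)} for each mode and bin."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")

    groups = defaultdict(list)
    for xi, yi, mi in zip(x, y, mode):
        start = math.floor(xi / bin_width) * bin_width
        groups[(mi, start)].append(yi)

    return {
        key: (len(values), statistics.fmean(values))
        for key, values in sorted(groups.items())
    }


if __name__ == "__main__":
    # Made-up illustrative data with two operating modes, "A" and "B".
    x = [1, 2, 6, 7, 1, 6]
    y = [10, 12, 20, 22, 30, 40]
    mode = ["A", "A", "A", "A", "B", "B"]
    for (m, start), (count, mean_y) in binned_means_by_mode(x, y, mode, 5).items():
        print(f"mode {m}, bin from {start}: n={count}, mean={mean_y:.1f}")
```

Each mode can then be drawn as its own binned scatter, and bins with a very low count can be greyed out or dropped before reading anything into them.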



Tips & Tricks: 



  • Modelling out and integrating your overriding factors is key to extracting meaningful results.
  • Experiment with bin sizes; you may find that your bins are too big or too small.
  • Use T-tests to confirm your conclusions.
  • Colour your points by a third variable to explore how correlations with other variables may be affecting your data.
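For the T-test tip, a minimal sketch of Welch's unequal-variance t-test using only the Python standard library is shown below. Choosing Welch's variant is an assumption made here because two ranges of a process variable rarely have the same spread; the function name `welch_t` and the sample values are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t_statistic, degrees_of_freedom) for Welch's two-sample t-test."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two values")

    # Squared standard error of each group mean.
    va = statistics.variance(a) / na
    vb = statistics.variance(b) / nb
    if va + vb == 0:
        raise ValueError("both groups have zero variance")

    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    # Made-up illustrative KPI values for two selected ranges of a variable.
    low_range = [86.0, 87.5, 86.8, 88.1, 87.9]
    high_range = [90.2, 91.0, 90.5, 91.4, 91.1]
    t, df = welch_t(high_range, low_range)
    print(f"t = {t:.2f}, df = {df:.1f}")
```

The t statistic and degrees of freedom can be compared against a t-table for a p-value. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.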



How useful have you found binned scatters?


Have you ever been caught out by poorly selected bin sizes or outliers? 


This article series focuses on our key visuals. Interlate hopes to share our experience with others and provide a robust understanding of their place in Clarofy, our visualisation and data analytics application. Clarofy requires no software installation and runs straight from your browser.