
Exploration Visuals – Scatter Plots


Everyone knows the scatter plot; it is the default. But it shouldn’t be for everything. It is great for discovering and exploring relationships, but only sometimes the best choice for getting your point across. Below we share the best applications for scatter plots and our tips and tricks for getting the most out of them.


This is article 3/7 in our series focussing on the main visualisations our analytics team use.

To receive a copy of the whole series now go to


Good For:

  • Discovering and exploring relationships
  • KPIs vs process variables, e.g. throughput vs P80; recovery vs concentrate grade; feed metal tons vs collector dosage.
  • Identifying the dominant relationship in the plant, such as feed grade vs recovery.
  • Visualising relationships with only two or three relational factors, such as throughput, P80 and power draw of major equipment.

Difficult With:

  • Drawing actionable conclusions, calculating valuations, and finding best operating ranges.
  • Very high-density data, normally found when looking at minute-by-minute or second-by-second data, or at a few years of data.
  • Variables vs. setpoints, as the points can often be stacked on top of each other.
  • Discovering more subtle relationships in the plant (often over-shadowed by the strongest relationships).
  • Multi-dimensional factors, such as float recovery vs a combination of level, air, feed grade, concentrate grade etc.

Tips & Tricks:

  • Use colour as a third dimension to get information on multiple interactions.
    • Colour by the timestamp to see if there are general trends over time.
    • Colour by ore body to see if there are trends within the ore bodies.
    • Colour by operating mode or flowsheet configuration.
  • Outliers can often make a trend seem more prominent than it is, or may even produce an incorrect trend, as shown in the graphs below from Clarofy.

Before filtering, the gradient of the trend line is slightly negative, with the cluster of points at zero reagent dosage influencing the relationship.

After filtering out the zeros and the outliers above 12 gpt, the gradient of the trend line against recovery is shown to be positive.
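
The before-and-after comparison described in these captions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are made up, the helper names `ols_slope` and `filter_dosage` are hypothetical, and a simple least-squares line stands in for whatever trend line Clarofy draws. Only the filter limits (drop zero dosages, drop dosages above 12 gpt) come from the example above.

```python
from statistics import fmean


def ols_slope(x, y):
    """Least-squares slope of y on x, i.e. the gradient of a straight trend line."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def filter_dosage(dosage, recovery, low=0.0, high=12.0):
    """Keep only points with low < dosage <= high.

    With the defaults this drops the zero dosages and the outliers above 12 gpt.
    """
    kept = [(d, r) for d, r in zip(dosage, recovery) if low < d <= high]
    return [d for d, _ in kept], [r for _, r in kept]


# Made-up illustrative data (not plant data): dosage in gpt, recovery in %.
dosage = [0.0, 0.0, 0.0, 4.0, 6.0, 8.0, 10.0, 15.0]
recovery = [90.0, 91.0, 89.0, 84.0, 85.0, 86.0, 87.0, 80.0]

slope_before = ols_slope(dosage, recovery)
f_dosage, f_recovery = filter_dosage(dosage, recovery)
slope_after = ols_slope(f_dosage, f_recovery)

print(f"gradient before filtering: {slope_before:+.2f}")
print(f"gradient after filtering:  {slope_after:+.2f}")
```

On this made-up data the gradient changes sign once the zeros and the high outlier are removed. Whether removing those points is justified is a process question, not a statistical one, so the limits should come from conversations with the plant rather than from the scatter itself.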


  • When analysing continuous plant data, an understanding of the process is important. The practical meaning of the outliers (in the example above: zero reagent dosages) is not available in the data, but regular conversations with operators can shed light on the following types of events:
    • Start-up and shutdown times (filter by throughput to remove this error).
    • Instrument error (check PV vs. SV visualisations, met accounting vs plant data, and any notes about instruments).
    • Plant events such as trials, mill re-lines and equipment commissioning, as well as natural variation in operator behaviour, control philosophies and equipment limitations.
  • Break up the scatters into groups if a large amount of data is available. It is often practical to break the data into yearly or 6-monthly sections, as it is unlikely that a plant would operate with consistent tactics for that amount of time. Other ways of categorising are by the ore bodies being treated, if the data is available, or by flowsheet configuration. Data points can be coloured by these groups, or each group may be examined separately to discover trends. Be aware of trends across groups masquerading as trends within the group!
  • With high-density data, the following techniques work well to get more out of your data at this early exploration stage:
    • Aggregate data over a period that makes practical processing sense for the equipment you are examining, e.g. for a thickener, 1-hour or even 6-hour aggregations would be appropriate, but for a single float bank, 2-3 times the residence time (usually minutes) is more fitting.
    • Examine scatters for each month, quarter or year if a time trend hypothesis exists. It might not yet be clear why that relationship exists, but this kind of view can offer more information on the nature of the association.
    • Use a density scatter plot (a variation of a standard scatter) that will colour based on the density of the data points in the area.
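
Two of the high-density techniques above, time aggregation and density colouring, can be sketched with the Python standard library alone. This is a rough illustration rather than how Clarofy implements either feature: the function names `aggregate` and `density_counts`, the synthetic minute-by-minute data and the bin sizes are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta
from statistics import fmean


def aggregate(rows, window):
    """Average (x, y) samples over fixed time windows.

    rows   : iterable of (timestamp, x, y) tuples
    window : timedelta, e.g. 1 hour for a thickener
    Returns a list of (window_start, mean_x, mean_y), sorted by time.
    """
    epoch = datetime(1970, 1, 1)
    buckets = defaultdict(list)
    for ts, x, y in rows:
        index = (ts - epoch) // window  # integer number of the window
        buckets[index].append((x, y))
    return [
        (epoch + index * window,
         fmean([x for x, _ in pts]),
         fmean([y for _, y in pts]))
        for index, pts in sorted(buckets.items())
    ]


def density_counts(xs, ys, x_bin, y_bin):
    """Count the points falling in each rectangular cell of size x_bin by y_bin."""
    return Counter((int(x // x_bin), int(y // y_bin)) for x, y in zip(xs, ys))


# Synthetic minute-by-minute data for three hours (illustrative only).
start = datetime(2024, 1, 1, 0, 0)
rows = [
    (start + timedelta(minutes=i), 100.0 + i % 10, 50.0 + (i % 10) / 2)
    for i in range(180)
]

# One point per hour instead of one per minute.
hourly = aggregate(rows, timedelta(hours=1))

# Points per cell; a density scatter colours each point by its cell's count.
xs = [x for _, x, _ in rows]
ys = [y for _, _, y in rows]
counts = density_counts(xs, ys, x_bin=5.0, y_bin=2.5)

print(len(rows), "raw points ->", len(hourly), "hourly points")
print("occupied density cells:", len(counts))
```

The window passed to `aggregate` is the design choice that matters: it should follow the equipment, as described above, rather than a default such as one hour for everything.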


This article series focuses on our key visuals. Interlate hope to share our experience with others and provide a robust understanding of the place each visual has in Clarofy, our visualisation and data analytics application. Clarofy requires no software installation and runs straight from your browser.