They are better than sand replacements in my own experience. I too have over 26 years' experience, although we did not get gauges until later in the 80s. When I get the gauges calibrated (usually 2 or 3 on a job at one time in case one goes down) I always ask for the R² value to be included with the calibration. Where this is below 95%, re-calibrate the gauge, get a different one and/or look at the material under assessment and see what could be causing the big errors in repeat readings (PFA, ash, slag, organics, some rock types like granites, etc.).
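For anyone who wants to check the R² figure themselves rather than take it off the certificate, here is a rough sketch. It assumes you have paired values of known reference density and gauge reading from the calibration, and that a straight-line fit is what the calibration house used (that is my assumption, check with them). The function names and the 0.95 default are only for illustration.

```python
def r_squared(reference, gauge):
    """R^2 of a least-squares straight line relating gauge readings to reference densities."""
    n = len(reference)
    mean_x = sum(gauge) / n
    mean_y = sum(reference) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in gauge)
    syy = sum((y - mean_y) ** 2 for y in reference)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gauge, reference))
    # For a simple straight-line fit, R^2 equals the squared correlation coefficient
    return (sxy * sxy) / (sxx * syy)


def calibration_ok(reference, gauge, threshold=0.95):
    """True when the fit meets the threshold (0.95 is the 95% figure quoted above)."""
    return r_squared(reference, gauge) >= threshold
```

Call `calibration_ok(reference, gauge)` with the two lists in the same order; a `False` is the prompt to re-calibrate, swap the gauge or look harder at the material.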
Always have at least one 'other' density measurement technique done per day of earthworks filling. With clays this would normally be a core cutter every 5 to 10 NDG readings, run as a rolling calibration. It is more difficult with granular soils (I have very little faith in the SRD test), so I would normally have a spec requirement for calibration boxes to be done on a regular basis. Where stabilised soils are being tested, this would probably be one per day. Then use the data from the calibration box as a rolling assessment. I also plot the results on a rolling basis to see if there are any trends which could indicate the gauge is giving erroneous readings.
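As a rough illustration of what I mean by a rolling calibration, the sketch below keeps the last few paired checks (gauge reading against core cutter or calibration box) and applies the average difference to later gauge readings. The class name, the window of 5 checks and the simple mean-offset correction are my own assumptions for the example, not anything from a standard; a spec may call for a different correction.

```python
from collections import deque


class RollingCalibration:
    """Rolling offset between gauge densities and an independent reference method."""

    def __init__(self, window=5):
        # Only the most recent `window` paired checks are kept (assumed value)
        self.offsets = deque(maxlen=window)

    def add_check(self, gauge_density, reference_density):
        """Record one paired check, e.g. NDG reading and core cutter at the same spot."""
        self.offsets.append(reference_density - gauge_density)

    def offset(self):
        """Mean offset over the current window; zero until a check has been recorded."""
        if not self.offsets:
            return 0.0
        return sum(self.offsets) / len(self.offsets)

    def correct(self, gauge_density):
        """Apply the current rolling offset to a raw gauge density."""
        return gauge_density + self.offset()
```

Logging `offset()` after each check gives the series to plot; a steady drift in one direction is the sort of trend that should make you question the gauge.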
One big point to note is that there are a LOT of older gauges out there at the moment due to the cost of de-commissioning them. From my own experience lots are still in circulation that should not be, and a number have been ‘given’ away to other test houses who have fewer scruples. So when I start to doubt the readings, one of the first questions is “how old is the gauge?” and can I look at the ‘daily checks’.
What you do get with a gauge is a lot of data in a short space of time. How much reliance you place on an individual reading will be a reflection of your own confidence in the kit, technician, material, and how critical the data is.
When I write specs, I always ask for additional in-situ assessments to be done at each and every NDG reading, normally either a set of hand vanes or MEXE probes, so I can make a judgement as to the value of the data. I would never rely on an uncorrected gauge in isolation (I am just writing a review of someone else's work where they have done this, and the numbers just don't stack up!).