|
|
WattPlot™'s Inaccurate Data Storage
For a product which touts that it stores every single bit, and logs and
archives all (emphasis theirs) the data produced by the Mate display
controller in its proprietary, compressed data file, we'd expect
WattPlot to make sure it really does that. We believe in open and
interoperable software, which means we have to verify that when we
process a WattPlot program's output, we produce the correct data as
well.
While testing event data capture and analysis, and verifying that the
greenMonitor data agreed with what the WattPlot software reported, we
found that a significant amount of information was missing. We can't
claim to be compatible if we don't know whose fault it is, so we set
out to solve the mystery.
These are the events as reported by the WattPlot software --
09/10/26 08:12:41 MX-3 Bulk
09/10/26 08:40:56 FX-1 Support, Using AC
09/10/26 08:40:56 FX-2 Pass Through, Using AC
09/10/26 08:41:02 FX-1 Pass Through, Using AC
09/10/26 08:41:08 FX-1 Charging, Using AC
09/10/26 08:41:26 FX-1 Pass Through, Using AC
09/10/26 09:42:06 MX-3 Silent
Next, we located the data for the time between when the inverters
resumed "Use" mode and when they had stabilized. There is often a
brief period during which the chargers turn on, and we wanted to verify
that the correct data was being captured. The data from that period
was saved to an EPD file and converted into a CSV file for viewing.
Several records we expected to see in the EPD file were missing.
Here is the CSV file that was produced by WattPlot --
"FX","26 Oct 08:40 to 08:43"
"FX Mode","Inverter Amps","Charger Amps","Buy Amps","Sell Amps","AC Input","AC Output","Batt VDC","Time"
"Inv On, AC Dropped",2,0,0,0,126,120,48.8,08:40
"Inv On, AC Dropped",2,0,0,0,126,120,48.8,08:40
"Inv On, AC Dropped",2,0,0,0,126,120,48.8,08:40
"Support, Using AC",2,0,0,0,124,120,48.8,08:40
"Support, Using AC",3,0,0,0,125,123,48.8,08:40
"Support, Using AC",2,0,0,0,126,124,48.8,08:40
"Support, Using AC",2,0,0,0,126,124,48.8,08:40
"Support, Using AC",2,0,0,0,126,124,48.8,08:40
"Support, Using AC",1,0,0,0,125,123,48.8,08:40
"Inv/Pass Thru, Using AC",0,0,1,0,124,122,48.8,08:40
"Charging, Using AC",0,2,4,0,123,121,49.2,08:41
"Charging, Using AC",0,2,5,0,124,122,49.2,08:41
"Charging, Using AC",0,3,5,0,124,122,49.2,08:41
"Charging, Using AC",0,4,6,0,125,123,49.2,08:41
"Charging, Using AC",0,5,7,0,125,123,49.6,08:41
"Charging, Using AC",0,6,9,0,125,123,49.6,08:41
"Charging, Using AC",0,7,9,0,125,123,49.6,08:41
"Charging, Using AC",0,8,10,0,125,123,49.6,08:41
"Charging, Using AC",0,9,11,0,125,123,49.6,08:41
"Charging, Using AC",0,9,12,0,125,122,50.0,08:41
"Charging, Using AC",0,10,13,0,124,122,50.0,08:41
"Charging, Using AC",0,11,14,0,124,122,50.0,08:41
"Charging, Using AC",0,11,14,0,124,122,50.0,08:41
"Charging, Using AC",0,11,14,0,124,122,50.0,08:41
"Charging, Using AC",0,11,13,0,124,122,50.0,08:41
"Charging, Using AC",0,11,13,0,124,122,50.0,08:41
"Charging, Using AC",0,11,13,0,124,122,50.0,08:41
"Pass Through, Using AC",0,11,13,0,124,122,50.0,08:41
"Pass Through, Using AC",0,4,6,0,125,124,50.0,08:41
"Pass Through, Using AC",0,0,2,0,127,126,50.0,08:41
"Pass Through, Using AC",0,0,2,0,124,123,49.6,08:43
"Pass Through, Using AC",0,0,2,0,124,123,49.6,08:43
"Pass Through, Using AC",0,0,2,0,124,123,49.6,08:43
These are the events that we were trying to verify. The times are off
by about 7 seconds from what WattPlot reported, and we think we know
why. I'll explain at the end.
One of the records shown in the WattPlot event log is missing from
here. That's because the system was configured with our unique feature
to suppress quickly changing events so the event log isn't cluttered.
10/26/2009 08:12:30: Charger 'Charger': The charger is now AWAKE.
10/26/2009 08:40:49: Inverter 'Slave': Operating mode is now 'Pass Thru'
10/26/2009 08:40:55: Inverter 'Master': Operating mode is now 'Pass Thru'
10/26/2009 08:41:05: Inverter 'Master': Operating mode is now 'Charge'
10/26/2009 08:41:17: Inverter 'Master': Operating mode is now 'Pass Thru'
10/26/2009 09:48:43: Charger 'Charger': The charger is now AWAKE.
This is the actual raw Mate data. The greenMonitor software comes with
tools that can extract a precise range of records from a compressed
Mate data file. The records here are limited to the master inverter in
the pair and run from 08:40:46 to 08:41:20, a span of exactly 34
seconds. You can count the records for yourself: there are 35 of them,
one for each second from 08:40:46 through 08:41:20 inclusive.
1,02,00,00,123,120,00,02,000,01,488,024,000,041
1,02,00,00,123,120,00,02,000,01,488,024,000,041
1,02,00,00,124,120,00,08,000,02,488,024,000,049
1,03,00,00,125,123,00,08,000,02,488,024,000,054
1,02,00,00,126,124,00,08,000,02,488,024,000,055
1,02,00,00,126,124,00,08,000,02,488,024,000,055
1,02,00,00,125,123,00,08,000,02,488,024,000,053
1,01,00,00,125,123,00,08,000,02,488,024,000,052
1,00,00,01,124,122,00,10,000,02,488,024,000,043
1,00,00,01,123,121,00,10,000,02,492,024,000,036
1,00,00,02,123,121,00,10,000,02,492,024,000,037
1,00,00,02,122,121,00,10,000,02,492,024,000,036
1,00,00,02,122,120,00,03,000,02,492,024,000,037
1,00,01,03,122,121,00,03,000,02,492,024,000,040
1,00,02,04,123,121,00,03,000,02,492,024,000,043
1,00,02,05,124,122,00,03,000,02,492,024,000,046
1,00,03,05,124,122,00,03,000,02,492,024,000,047
1,00,04,06,125,123,00,03,000,02,492,024,000,051
1,00,05,07,125,123,00,03,000,02,496,024,000,057
1,00,06,09,125,123,00,03,000,02,496,024,000,060
1,00,07,09,125,123,00,03,000,02,496,024,000,061
1,00,08,10,125,123,00,03,000,02,496,024,000,054
1,00,09,11,125,123,00,03,000,02,496,024,000,056
1,00,09,12,125,122,00,03,000,02,500,024,000,042
1,00,10,13,124,122,00,03,000,02,500,024,000,034
1,00,11,14,124,122,00,03,000,02,500,024,000,036
1,00,11,14,124,122,00,03,000,02,500,024,000,036
1,00,11,14,124,122,00,03,000,02,500,024,000,036
1,00,11,13,124,122,00,03,000,02,500,024,000,035
1,00,11,13,124,122,00,03,000,02,500,024,000,035
1,00,11,13,124,122,00,03,000,02,500,024,000,035
1,00,11,13,124,122,00,10,000,02,500,024,000,033
1,00,04,06,125,124,00,10,000,02,500,024,000,040
1,00,00,02,127,126,00,10,000,02,500,024,000,036
1,00,00,02,124,123,00,10,000,02,496,024,000,044
The records in italics are the ones that are completely missing. How
did this happen? We think the problem is the EPD files themselves. We
figured this one out when working on the gmRemote command for
greenMonitor v1.00.6. The WattPlot software works under the assumption
that records are produced every second. The new gmRemote command was
accurately timestamping records when they were received, and rotating
the files according to the selected time interval, but if each
Remote_#.dat file didn't contain precisely the correct number of
records (according to Intallact's concept of "correct"), the WattPlot
Monitor command failed to process the files. So the gmRemote command
had to start lying and rebundle records according to what the WattPlot
Monitor command expected.
We believe that every now and again, the WattPlot software goes "Oops!"
and realizes it lost some time. And that's where 08:42 went to.
"Oops!" It's probably where the missing records went as well, but we
can't be sure.
So how does time just get lost? The Mate™ Display Controller is not a
precision timepiece. It tries to produce records once a second, but
over the course of the 86,400 seconds in a day, even a tiny timing
error adds up. In addition, the Mate Display Controller skips records
when it receives a command or a button is pressed. The EPD files don't
contain accurate time information, so they have no way of accurately
noting which records were lost and when.
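
As a rough illustration of why per-record timestamps matter, here is a
minimal sketch, assuming each captured line has the form
"MM/DD/YYYY HH:MM:SS|record" (the layout of the greenMonitor captures
shown later in this article). The function name find_gaps and the
one-second nominal interval are illustrative assumptions, not part of
either product. The sample lines are slave inverter records taken from
the gmServer capture later in this article.

```python
from datetime import datetime, timedelta

def find_gaps(lines, expected=timedelta(seconds=1)):
    """Return (previous, current) timestamp pairs more than `expected` apart.

    Several records may share one timestamp (one per device); those
    produce a zero difference and are never reported as gaps.
    """
    gaps = []
    previous = None
    for line in lines:
        stamp_text, _, _record = line.partition("|")
        stamp = datetime.strptime(stamp_text.strip(), "%m/%d/%Y %H:%M:%S")
        if previous is not None and stamp - previous > expected:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps

SAMPLE = [
    "03/26/2010 17:00:31|2,00,00,00,123,123,00,92,000,02,544,152,000,048",
    "03/26/2010 17:00:34|2,00,00,00,123,123,00,92,000,02,544,152,000,048",
    "03/26/2010 17:00:35|2,00,00,00,123,123,00,92,000,02,544,152,000,048",
    "03/26/2010 17:00:37|2,00,00,00,123,123,00,92,000,02,544,152,000,048",
]

for before, after in find_gaps(SAMPLE):
    missing = int((after - before).total_seconds()) - 1
    print(f"{missing} second(s) without a record between "
          f"{before:%H:%M:%S} and {after:%H:%M:%S}")
```

Without a timestamp on each record, the same four records would look
like four consecutive seconds, which is the kind of silent loss
described above.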
When a battery voltage isn't a battery voltage
If you think this is merely an academic exercise, consider that in
December 2009 a problem arose with the accuracy of the FLEXnet DC data.
From time to time, -25.6 amp values were being recorded for one of the
shunts belonging to an OutBack Power Systems customer. I'd seen
similar odd behavior and thought very little of it, until I reviewed
the December production logs for my own system and saw a 25.6 volt
battery value, along with a 76.7 volt value.
Fortunately, because the greenMonitor software does store all of the
data, I was able to locate the exact records. With Version 1.00.6
approaching General Availability, I'd also been running WattPlot to
verify that the two products remained compatible. At 8:48 pm on
December 25, 2009, the FLEXnet DC on my system recorded a 25.6 volt
reading. What did WattPlot record?
"DC-4","25 Dec 20:48 to 20:48"
"? shunt","? shunt","? shunt","Batt VDC","Batt SOC","Batt Temp","Time"
0.2,-0.1,-0.1,51.2,100,15,20:48 0.1,-0.1,-0.1,51.2,100,15,20:48 0.1,-0.1,-0.1,51.2,100,15,20:48 0.1,-0.1,-0.2,51.2,100,15,20:48 0.1,-0.1,-0.2,51.1,100,15,20:48 0.1,-0.1,-0.1,51.2,100,15,20:48 0.1,-0.1,-0.1,51.2,100,15,20:48 0.0,-0.1,-0.2,51.2,100,15,20:48 0.1,-0.1,-0.1,51.1,100,15,20:48 0.1,-0.2,-0.2,51.2,100,15,20:48 0.1,-0.1,-0.1,43.1,100,15,20:48 0.1,-0.1,-0.1,43.1,100,15,20:48 0.1,-0.1,-0.1,51.1,100,15,20:48 0.2,-0.1,-0.2,51.2,100,15,20:48 0.1,-0.1,-0.1,51.2,100,15,20:48 0.1,-0.1,-0.1,51.2,100,15,20:48 0.2,0.0,-0.1,51.2,100,15,20:48 0.0,-0.2,-0.2,51.1,100,15,20:48
With WattPlot, the problem is masked because WattPlot does not store
the actual data, in spite of its marketing claims that it does. The
only way to store the actual data is to use the uncompressed OBM file
format, which consumes very large amounts of disk storage. The data
stored in the EPD file is inaccurate and incomplete.
What did greenMonitor record?
12/25/2009 20:48:26|1,00,00,02,123,122,00,10,000,02,512,024,000,031
12/25/2009 20:48:26|2,00,00,01,124,123,00,10,000,02,512,024,000,033
12/25/2009 20:48:26|D,00,00,00,021,121,00,00,000,00,512,219,000,047
12/25/2009 20:48:26|d,0001,0002,0002,07,00098,512,100,000,50,15,101
12/25/2009 20:48:27|1,00,00,02,123,122,00,10,000,02,512,024,000,031
12/25/2009 20:48:27|2,00,00,01,124,123,00,10,000,02,512,024,000,033
12/25/2009 20:48:27|D,00,00,00,021,121,00,00,000,00,512,219,000,047
12/25/2009 20:48:27|d,0001,0001,0001,08,00219,256,100,000,50,15,100
12/25/2009 20:48:28|1,00,00,02,123,122,00,10,000,02,512,024,000,031
12/25/2009 20:48:28|2,00,00,01,124,123,00,10,000,02,512,024,000,033
12/25/2009 20:48:28|D,00,00,00,021,121,00,00,000,00,512,219,000,047
12/25/2009 20:48:28|d,0001,0001,0001,09,00206,256,100,000,50,15,097
12/25/2009 20:48:29|1,00,00,02,123,122,00,10,000,02,512,024,000,031
12/25/2009 20:48:29|2,00,00,01,124,123,00,10,000,02,512,024,000,033
12/25/2009 20:48:29|D,00,00,00,021,121,00,00,000,00,512,219,000,047
12/25/2009 20:48:29|d,0001,0001,0001,10,01197,511,100,000,50,15,093
There it is -- every single bit of data, something Intallact claims to
store, but doesn't. Using our compressed data format, each day's
timestamped, accurate and complete data doesn't require almost 17MB of
disk storage, as it would with WattPlot. It requires only 1.2MB, a
compression rate of approximately 93%.
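
The 93% figure can be checked with a couple of lines of arithmetic.
This is only a sketch using the rounded sizes quoted above (17MB is
stated as "almost 17MB", so the true saving may differ slightly); the
function name space_saving is an illustrative assumption.

```python
def space_saving(compressed_mb, uncompressed_mb):
    """Fraction of disk space saved relative to the uncompressed size."""
    return 1.0 - compressed_mb / uncompressed_mb

# Sizes quoted in the text: 1.2MB compressed versus almost 17MB uncompressed.
saving = space_saving(1.2, 17.0)
print(f"{saving:.1%}")
```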
A System Failure WattPlot Missed
My own system is two-and-a-half years old now, and the cabinet cooling
fan recently died. I've been busy working with clients on their
systems, as well as getting a bit of spring lawn and garden work
handled here at the house, and I procrastinated on replacing the fan.
BIG mistake. Renewable energy systems need to be cared for when your
monitoring software first reports an error, and perhaps I need to treat
myself more like a client of greenHouse Computers and less like the
owner of the company. The result -- at 17:00:18 on March 26, 2010, the
slave inverter again reported a "Communications Error". The master
inverter had stopped communicating with the Hub 10 in the cabinet.
This is not an error that should be ignored, or that your monitoring
software should fail to log.
03/26/2010 17:00:18: Inverter 'Slave': Operating mode is now 'Comm Error'
What did WattPlot do? NOTHING!
10/03/27 00:22:32 ** Communications timed-out but TCP link OK - will wait
10/03/26 17:03:23 TCP/IP connection closed by MATE server: 192.168.0.1
10/03/26 01:00:39 DC-4 WARNING: 'Midnight' reset at 01:00 AM - Check MATE clock setting
It's hard for me to imagine that any software product which makes such
a big deal about "saving every bit" of data could completely miss such
a critical error, but it did. WattPlot can detect that I didn't change
the Mate clock for Daylight Saving Time (we don't make that check
because our software is network enabled and the time where the system
is located can differ by up to 23 hours from where the system is being
monitored), but it cannot detect that a complete system failure is
about to happen? That's simply unacceptable.
What did WattPlot record in the EPD file?
"FX-2","26 Mar 17:00 to 17:03"
"FX Mode","Inverter Amps","Charger Amps","Buy Amps","Sell Amps","AC Input","AC Output","Batt VDC","Time"
"Inv/Pass Thru, Using AC",0,0,0,0,123,123,54.4,17:00
"Inv/Pass Thru, Using AC",0,0,0,0,123,123,54.4,17:00
"Inv/Pass Thru, Using AC",0,0,0,0,123,123,54.4,17:00
"??, Using AC",0,0,0,0,123,123,54.4,17:03
"??, Using AC",0,0,0,0,123,123,54.4,17:03
"??, Using AC",0,0,0,0,123,123,54.4,17:03
There isn't even enough information to know what the "??" means.
It turns out that it means WattPlot has no idea when records are
received, because the Hub 10 completely stopped sending data at
17:00:41, just 23 seconds after the initial Comm Error was received.
What did greenMonitor's gmServer command record?
03/26/2010 17:00:18|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:18|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:18|d,0223,0215,0003,07,00099,547,099,000,50,37,144
The "92" is the value indicating a "Comm Error". And if you look,
you'll see that there was no record for the master inverter.
In fact, over the next 23 seconds, the Hub 10 would slowly stop sending
data to the Mate display controller for the other devices in the
system, first losing the master GVFX 3648, then the FLEXnet DC, then
the MX-60, until only the slave inverter was reported, followed by it
no longer being reported either.
03/26/2010 17:00:31|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:31|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:31|d,0223,0215,0003,06,00003,547,099,000,50,37,128
03/26/2010 17:00:34|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:34|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:34|d,0223,0215,0003,07,00099,547,099,000,50,37,144
03/26/2010 17:00:35|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:35|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:35|d,0223,0215,0003,08,00281,547,099,000,50,37,138
03/26/2010 17:00:35|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:35|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:35|d,0223,0215,0003,09,00272,547,099,000,50,37,139
03/26/2010 17:00:37|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:37|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:37|d,0223,0215,0003,10,01532,547,099,000,50,37,131
03/26/2010 17:00:37|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:37|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:37|d,0223,0215,0003,11,01478,547,099,000,50,37,141
03/26/2010 17:00:39|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:39|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:39|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:39|D,00,22,16,082,156,00,00,000,02,548,283,000,085
03/26/2010 17:00:41|2,00,00,00,123,123,00,92,000,02,544,152,000,048
03/26/2010 17:00:41|2,00,00,00,123,123,00,92,000,02,544,152,000,048
For 23 seconds, from 17:00:18 to 17:00:41, the Mate display controller
was reporting a "Comm Error", and WattPlot never once reported the
error.
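
To show how little is needed to catch this condition, here is a minimal
sketch that scans timestamped lines like the ones above for an inverter
record whose operating mode is 92. The field names are assumptions:
the amps, voltage and battery positions are inferred by comparing the
raw records in this article with the WattPlot CSV columns, and the
remaining labels are placeholders. The checksum rule (the last field
equals the sum of each character's value minus that of '0') is only an
observation that holds for the sample records in this article, not a
quoted specification. None of the function names come from
greenMonitor.

```python
FX_FIELDS = (
    "address", "inverter_amps", "charger_amps", "buy_amps",
    "ac_input_v", "ac_output_v", "sell_amps", "operating_mode",
    "error_mode", "ac_mode", "battery_v_x10", "misc", "warning_mode",
    "checksum",  # several of these labels are assumed, see the text above
)
COMM_ERROR_MODE = 92  # per the article: "92" indicates a "Comm Error"

def checksum_ok(record):
    """Compare the last field with the character sum of the other fields."""
    body, _, tail = record.rpartition(",")
    total = sum(ord(ch) - ord("0") for ch in body if ch != ",")
    return total == int(tail)

def parse_fx(record):
    """Split one inverter record into a dict keyed by FX_FIELDS."""
    values = record.split(",")
    if len(values) != len(FX_FIELDS):
        raise ValueError(f"expected {len(FX_FIELDS)} fields, got {len(values)}")
    parsed = dict(zip(FX_FIELDS, values))
    parsed["operating_mode"] = int(parsed["operating_mode"])
    parsed["battery_volts"] = int(parsed["battery_v_x10"]) / 10
    return parsed

def is_comm_error(line):
    """True when a 'timestamp|record' line is an inverter in mode 92."""
    _, _, record = line.partition("|")
    if not record[:1].isdigit():
        # In the samples here, inverter records start with a digit
        # (1 = master, 2 = slave); other devices start with a letter.
        return False
    return checksum_ok(record) and parse_fx(record)["operating_mode"] == COMM_ERROR_MODE

CAPTURE = [
    "03/26/2010 17:00:18|2,00,00,00,123,123,00,92,000,02,544,152,000,048",
    "03/26/2010 17:00:18|D,00,22,16,082,156,00,00,000,02,548,283,000,085",
    "03/26/2010 17:00:18|d,0223,0215,0003,07,00099,547,099,000,50,37,144",
]

for line in CAPTURE:
    if is_comm_error(line):
        print("Comm Error:", line.partition("|")[0])
```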
Robust data capture and playback are essential for problem
determination by system maintenance personnel. It's also critical for
your software developer to have this information so that we can
determine how systems perform in the seconds before complete failure.
At greenHouse Computers, we have the tools for you to detect these
critical errors. At Intallact, they DON'T.
When diagnosis is impossible
We recently encountered an OutBack Power Systems and WattPlot customer
who was having a problem with their FLEXnet DC battery monitor. For
some reason, the FLEXnet DC was reporting a decline in state of charge
over the course of the day.
The customer performed a fairly extensive data analysis on the shunt
values and state of charge, then concluded that the FLEXnet DC was
using some incorrect model. I asked for the raw data, assuming the
customer had used that to produce the charts, and they instead provided
processed data.
After much struggling to convince the customer to send raw data for
analysis, they finally sent a CSV output file from WattPlot -- the "raw
data" format, as Intallact calls it.
The important fields are missing from the FLEXnet DC EPD file. The
only data recorded are the three shunt values, battery voltage and
temperature, and the state of charge.
What I was looking for was the cumulative data for the shunts, as well
as the total values. The problem is that WattPlot doesn't record that
data, and without it, it's impossible to determine if the values are
being computed correctly.
I decided that the only way to determine if their FLEXnet DC was
working correctly was to load the CSV file into a spreadsheet and add
up the values myself.
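
The same spreadsheet step can be sketched in a few lines. This is only
an illustration under stated assumptions: the column layout is taken
from the FLEXnet DC CSV excerpt shown earlier in this article (three
shunt columns first), and it assumes one row per second, which the
export itself cannot confirm because its Time column has only minute
resolution. The function name shunt_amp_hours and the sample rows
(the first three rows of that excerpt, not the customer's data) are
used purely for demonstration.

```python
import csv
import io

def shunt_amp_hours(csv_text, shunt_columns=(0, 1, 2), seconds_per_row=1.0):
    """Sum each shunt column and convert amp-seconds to amp-hours.

    Expects a title row, a column-name row, then one data row per
    `seconds_per_row` seconds (an assumption, see the note above).
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # title row, e.g. "DC-4","25 Dec 20:48 to 20:48"
    next(reader)  # column names
    amp_seconds = [0.0] * len(shunt_columns)
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        for k, col in enumerate(shunt_columns):
            amp_seconds[k] += float(row[col]) * seconds_per_row
    return [total / 3600.0 for total in amp_seconds]

SAMPLE = '''"DC-4","25 Dec 20:48 to 20:48"
"? shunt","? shunt","? shunt","Batt VDC","Batt SOC","Batt Temp","Time"
0.2,-0.1,-0.1,51.2,100,15,20:48
0.1,-0.1,-0.1,51.2,100,15,20:48
0.1,-0.1,-0.1,51.2,100,15,20:48
'''

print(shunt_amp_hours(SAMPLE))
```

Any rows that WattPlot dropped are simply absent from the sum, so a
total computed this way inherits whatever the EPD file lost.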
Several deficiencies in WattPlot contributed to making the problem
harder to diagnose than it would have been with the greenMonitor
software. The first is a lack of complete system data. The FLEXnet DC
EPD file doesn't contain the 14 "Extra Data" values that are provided
by the Mate.
The second is poor minimum resolution on the FLEXnet DC pen plot. With
a minimum resolution of +/- 500 watts, the FLEXnet DC pen plot doesn't
show systematic errors in shunt values very well.
Very low values for a shunt are shown in WattPlot as a fuzzy line with
poor resolution, but are shown with all of their detail on a
greenMonitor battery monitor plot.
I eventually concluded that the customer's problems were caused by a
battery bank that was too small, as well as a -0.3 amp error in shunt
"A".
It was definitely a learning experience, and worth all the time and
energy it took, even if the customer rejected my analysis because of
the conflict between Intallact and greenHouse Computers.
When being more accurate isn't
One of the first features we helped Intallact implement was the voltage
and amperage truncation feature. The way they implemented the feature
is a classic example of why you should get your technology from the
source, rather than the copycat.
The idea behind correcting for amperage truncation is simple enough --
the Mate display controller has a resolution of 1 amp for many fields.
While testing a new compatibility feature, we learned that WattPlot
isn't even keeping up with the daily kilowatt-hour value reported by
the charge controllers. Determined to figure out the cause of the
discrepancy, we found that WattPlot is simply adding 0.5 to every value
that is truncated to a whole number, regardless of the recent data
history for the device. WattPlot might as well just add 43,200
amp-seconds for each device at the end of the day -- 0.5 amp-seconds
for each of the 86,400 seconds of the day the device is operating.
After three and a half years of wrestling with this problem, we've
released two different solutions to the problem of inaccurate current
resolution. The first solution combines data from the FLEXnet DC
battery monitor and inverter to calculate the inverter output. The
second solution uses third-party products to measure input and output
AC current. Whatever is left over must be inverter current. With
these solutions we are able to detect current changes as small as 0.05
amps AC, a resolution 20 times finer than the AC current resolution of
WattPlot.
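
The arithmetic behind the 43,200 amp-second figure can be sketched as
follows. The steady 2.1 amp load is a hypothetical value chosen only
for illustration, the function name daily_totals is an assumption, and
the model (one truncated sample per second for a full day) is a
simplification of what either product actually does.

```python
SECONDS_PER_DAY = 86_400

def daily_totals(true_amps, offset=0.5):
    """Amp-second totals for one day of a steady, non-negative current."""
    reported = int(true_amps)  # truncation to whole amps, as described above
    return {
        "true": true_amps * SECONDS_PER_DAY,
        "truncated": reported * SECONDS_PER_DAY,
        "flat_corrected": (reported + offset) * SECONDS_PER_DAY,
    }

totals = daily_totals(2.1)  # hypothetical steady 2.1 A device
added = totals["flat_corrected"] - totals["truncated"]
print("added by the flat 0.5 correction:", added, "amp-seconds")
print("which is", added / 3600, "amp-hours per device per day")
print("resolution ratio, 1 A versus 0.05 A:", round(1 / 0.05))
```

The amount added is the same whatever the real current was, which is
why a flat correction cannot track the device's actual behavior.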
|