Saturday, November 24, 2012

Debugging the garage door remote

A while ago I built a little remote to open my garage door (well, not actually my garage door but the one of the house I live in, but who cares). But being the greedy kid I am, I wasn't satisfied with just opening the garage door with it. After all, it has four buttons, so I planned on also using it as a remote for my yet-to-be-built home automation system (a few parts of it already exist, but there is a lot of work to be done before I can really call it that).

The problem was that it had a nasty bug which was really hard to track down: It would work as intended for some time (sometimes hours, sometimes even days), but then, all of a sudden, it would hang. And not only did it cease to work, it also stayed in some state where it consumed lots of power and ate up the battery in no time. This was far from usable in real life.

I had heard of a little thing called a watchdog timer, which could reset the MCU if it hung. So the first thing I did was to add this watchdog timer to my code. Sure enough, that didn't fix my problem, so I tried everything I could think of, from incrementing counters in EEPROM after every "real" instruction in the code to hooking up oscilloscopes and logic analyzers. But the only thing I found out was that somehow, sometimes, it would get stuck in some weird kind of reset loop.

When I ran out of ideas of what else to try, I started badgering people on various internet forums. And after trying all sorts of changes to the code that people suggested, someone came up with the idea of deliberately causing a reset (among a few other things). I implemented that idea in my code, and what happened was that now, every time I caused the reset, the MCU would hang. That may not sound like a good thing at first, but at least now I could reproduce the bug!

After that, all it took was a bit of searching the web until I found out about a register called MCUSR, the MCU status register. Turns out that when the watchdog resets the MCU, it sets a flag (WDRF) in this register. But that flag doesn't get cleared by the reset itself, and as far as I understand it the watchdog stays enabled for as long as the flag is set, causing the MCU to keep resetting itself until the battery is dead or the world ends, whichever comes first.

So in the end all I had to do was to set the MCUSR to zero right at the beginning of the setup() routine, and all of a sudden the reset would be performed as intended and the MCU would work again. For now. I'll have to give it a few weeks to really call this case closed, but I'm pretty confident that this was it.
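For illustration, here is a minimal sketch of that fix. It assumes an ATmega-style AVR built with avr-libc, where the register is called MCUSR and wdt_disable() comes from avr/wdt.h. The helper name clear_watchdog_reset() and the host stand-ins in the #else branch are made up for this example and are not taken from the remote's actual code.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/wdt.h>
#else
/* Host stand-ins (hypothetical) so the sketch also compiles on a PC. */
static uint8_t MCUSR = 0;
#define WDRF 3
static int watchdog_enabled = 1;
static void wdt_disable(void) { watchdog_enabled = 0; }
#endif

/* Copy of the reset flags, kept in case the sketch wants to know why it restarted. */
uint8_t reset_cause;

/* Meant to be called as the very first thing in setup(). */
void clear_watchdog_reset(void)
{
    reset_cause = MCUSR; /* remember the flags before wiping them */
    MCUSR = 0;           /* clear WDRF so the watchdog can be switched off */
    wdt_disable();       /* stop the watchdog until the sketch re-enables it */
}
```

In an Arduino sketch this would be the first call in setup(), before anything slow happens, and the watchdog would be re-enabled afterwards with whatever timeout the sketch needs.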

All that's left to do is to say a big THANK YOU to all the people who tried to help me solve this mystery, and especially JohnO on the Jeelabs forum, who came up with the idea of causing that reset on purpose (although he might have had something completely different in mind than crashing the MCU every time the reset fired).

The updated code can be downloaded here.

Friday, November 23, 2012

Building Arduino Sketches without the IDE

I really love the Arduino, because after all it was what started all this.

The one thing I don't like is the IDE that comes with it. It's not the IDE's fault, it's just that I'm a console guy and I don't really like IDEs in general (and yes, I've seen others and I think this one is not particularly good). Luckily there are ways around it, like Makefiles that automatically include all the needed libraries and even upload the sketch to the board. There are several of these out there, and I went with this one which is a slightly modified version of this one. The Makefile is pretty good as is, but I made a few minor changes:

  1. The Makefile can search for libraries in ~/sketchbook/libraries, but instead of the ~/sketchbook folder I use ~/src/arduino, so I just replaced that.
  2. It reads the boards.txt file in the Arduino installation directory but I have my own boards.txt in ~/src/arduino, where I added my own boards, so I changed that path, too.
  3. The setup howto suggests that you symlink the file to your sketch directories, but if you do that, you still have to set the environment variable $BOARD to the right type, which I found a bit tedious. So instead of symlinking it, I create a new Makefile in each sketch folder that contains only two lines: The first one sets $BOARD to the board type that sketch uses, and the other just includes the original Makefile.

The result is a very tidy build system with no redundancies whatsoever.

As usual everything can be downloaded in my github repository:


Sunday, November 11, 2012

Smartmeter III - blinking LEDs

This is a short one. At least compared to the usual walls of text I post.

Due to the nature of my water heater (a continuous-flow-type), its power consumption is highly dependent on the water pressure and flow-speed. The more and faster the water flows, the more power it consumes. Apparently this happens in three stages. The first two turn on immediately as you turn on warm water. But the third will only come on if needed, i.e. if the water consumption continues to increase. Also, these three stages are connected to three different mains lines, which means they get counted separately.

Saving energy, resources and money is fine, but sacrificing creature comforts for it? No way!
When I take a shower I want it hot. And I don't want it trickling down on me, either.

That's not a problem with my new water (and thus electricity) saving shower head. What is a problem, though, is that the water flow has to be adjusted just right to keep the third (and greediest) stage of the water heater from turning on.

So to be able to see what the water heater is up to, while in the shower, I built this little thingy:

It's just a Jeenode with a few LEDs

The yellow, green and white LEDs blink if there's a pulse received for one of the three mains lines, and the red one will blink if a packet got lost. Right now it's powered with a single AA battery, but that lasts only a couple of days, so I'll have to add a little power supply, probably from an old phone, since I have plenty of those.

The code to drive it is here:

https://github.com/alibenpeng/powerblinker.git

Saturday, November 10, 2012

Smartmeter revisited

Einstein once said that only two things were infinite: The universe and human stupidity. The former even he wasn't sure about, and the latter I found out the hard way. But I'll start from the top:

I've had my smartmeter running for a while now and since the readings looked plausible I assumed they were accurate. Well, they weren't.

The meters I use generate 2000 pulses of 90ms length for every consumed kWh, and the way I counted them was to collect them for five seconds and then transmit a summary. After a while I found out two things about my power consumption:

  1. The water heater (a continuous-flow-type) consumes an insane amount of power, and
  2. The computer running the metering software also was a bit too greedy for my taste.

I couldn't do anything about the water heater because it's installed permanently and after all it's not even mine, but I could try and tame its power consumption by reducing the water pressure and/or the flow speed by using a water saving shower head. As a side effect I'd also reduce my water consumption. Two birds with one stone! And the PC running the metering software also had to go.

To get rid of the PC I had to rewrite the metering software from scratch, because the software from the volkszaehler.org project I was using is based on a full-fledged LAMP-stack (Linux, Apache, MySQL, PHP). The main problem was the MySQL-database, as this keeps every reading I ever took and has to access them all at once every time I wanted to view my consumption graph (remember, I took readings every five seconds, so there were a LOT of values in there).

I opted to use something called RRDTool instead. This is a nifty little tool, well known to sysadmins around the world to keep stats about routers and switches, system loads and whatnot. Since there are usually a lot of data sources to keep track of, the databases have to be kept small. The way it achieves this is to use fixed size databases of different resolutions and aggregation functions to consolidate older data.
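To make the idea a bit more tangible, here is a toy sketch of a fixed size round-robin store with two resolutions. It is not how RRDTool is actually implemented, just the general principle. The names (rr_db, rr_update) and the tiny slot counts are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define FINE_SLOTS   4 /* hypothetical: keep only the 4 newest raw samples */
#define COARSE_SLOTS 3 /* hypothetical: keep 3 consolidated values */
#define STEP         2 /* raw samples averaged into one coarse value */

typedef struct {
    double fine[FINE_SLOTS];     /* high resolution, short history */
    double coarse[COARSE_SLOTS]; /* low resolution, longer history */
    size_t fine_n;               /* raw samples written so far */
    size_t coarse_n;             /* consolidated values written so far */
    double acc;                  /* running sum for the current step */
    size_t acc_n;                /* samples in the running sum */
} rr_db;

void rr_init(rr_db *db)
{
    memset(db, 0, sizeof *db);
}

/* Store one sample. Old entries are overwritten, so the size never grows. */
void rr_update(rr_db *db, double value)
{
    db->fine[db->fine_n++ % FINE_SLOTS] = value;

    db->acc += value;
    if (++db->acc_n == STEP) {
        /* Consolidate: here with AVERAGE, other functions would work the same way. */
        db->coarse[db->coarse_n++ % COARSE_SLOTS] = db->acc / STEP;
        db->acc = 0.0;
        db->acc_n = 0;
    }
}
```

No matter how many readings come in, the store stays the same size: recent data is kept in detail, older data only survives as averages.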

While I was at it, I also changed the way I collected the pulses from the meters from the five-second-summary-method to transmitting every pulse as it came in. This is where things started to go south...

Before...


But the really stupid thing I did was changing something that drastically influenced my power consumption (the shower head) and the way I collected and processed the consumption data at the same time.

The result was that I grossly overestimated the savings from changing the shower head and thought I'd cut my consumption by about 50%, since that was what my new readings suggested. I never even bothered to check the "official" meter from the power company, which finally bit me in the ass when the next power bill came in.

...after...


Only then did I find out that my readings were way off! They may have looked plausible, but weren't anywhere near my true consumption.

After this wake-up call I started to investigate the error. At first it looked like the new way of counting pulses was to blame, so I changed that back to the old way. That gave me way higher readings, but by then I had grown suspicious about whether those were correct. They weren't either.

So the new method was obviously losing pulses while the old way was counting pulses that weren't there. The one thing both had in common was the way they went into the controller:

When idle the controller was in sleep mode and an incoming pulse would cause an interrupt to wake up the controller and send out a packet or increment a counter, respectively. Since the driver for the RFM12 transceiver is also heavily based on interrupts I figured that might have been the problem. I also had some code to blink an LED using the Arduino delay() function, which is a really bad idea when using interrupts at the same time, but I threw that out first and it didn't fix the problem.

So I changed the code to poll the inputs instead of sleeping and waiting for interrupts and that seemed to help with the lost pulses.

But things only got weirder: Now some pulses were counted twice. So I started counting milliseconds between pulses and it turned out there was some kind of bouncing about 75ms after some pulses, while others looked OK.

I knew that mechanical buttons tend to bounce, but that kind of bouncing occurs in the first ten ms after the button is pressed, not 75ms later! 75ms is close to eternity, even for a tiny and slow 8 bit microcontroller. A real computer could go fetch a coffee and have a smoke in 75ms! Also there weren't any mechanical switches anywhere near the circuit I've built. The pulses are generated digitally by the counters, go to an optocoupler and straight to the controller.

But I didn't want to rip the whole thing out of my breaker box and dissect it with an oscilloscope, so I just wrote a little routine that ignores every state change that occurs less than 90ms after the input went low. Think of it as a really slow debouncing routine.
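Roughly, such a routine could look like the sketch below. It assumes the input idles high and goes low at the start of a pulse, and that the caller polls it with a millisecond timestamp (on an Arduino that would typically come from millis()). The names pulse_filter and pulse_filter_update are made up for this example and are not the ones from my actual code.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOCKOUT_MS 90u /* ignore changes for this long after a falling edge */

typedef struct {
    bool     level;        /* last accepted input level, true = high (idle) */
    uint32_t last_fall_ms; /* timestamp of the last accepted falling edge */
    bool     seen_fall;    /* false until the first falling edge was accepted */
} pulse_filter;

void pulse_filter_init(pulse_filter *f)
{
    f->level = true;
    f->last_fall_ms = 0;
    f->seen_fall = false;
}

/* Feed one polled sample. Returns true exactly when a new pulse should be counted. */
bool pulse_filter_update(pulse_filter *f, bool level, uint32_t now_ms)
{
    /* Unsigned subtraction keeps working when the millisecond counter wraps. */
    if (f->seen_fall && (uint32_t)(now_ms - f->last_fall_ms) < LOCKOUT_MS) {
        return false; /* inside the lockout window: ignore every change */
    }
    if (level == f->level) {
        return false; /* nothing changed */
    }
    f->level = level;
    if (!level) {     /* accepted falling edge: a real pulse starts here */
        f->last_fall_ms = now_ms;
        f->seen_fall = true;
        return true;
    }
    return false;     /* rising edge: pulse is over, nothing to count */
}
```

With this, a glitch 75ms after the start of a pulse falls inside the lockout window and is simply dropped, while the next real pulse is counted as usual.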

and the real thing (hopefully)


So far my readings look OK. There still is an error of some 0.03kWh per day, give or take, but that's nowhere near what my old code produced.
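To put that into perspective: at 2000 pulses per kWh, 0.03kWh is just 60 pulses. The arithmetic behind that, and behind turning pulses into watts for both counting methods, can be sketched like this. The function names are made up for illustration.

```c
#define PULSES_PER_KWH 2000.0 /* from the meters' spec mentioned above */

/* Energy represented by a number of pulses, in kWh. */
double pulses_to_kwh(unsigned long pulses)
{
    return (double)pulses / PULSES_PER_KWH;
}

/* Average power in watts for pulses counted over a window (e.g. 5 seconds). */
double watts_from_window(unsigned long pulses, double window_s)
{
    /* kWh -> Wh -> Ws, then divide by the window length in seconds */
    return pulses_to_kwh(pulses) * 1000.0 * 3600.0 / window_s;
}

/* Power in watts estimated from the time between two consecutive pulses. */
double watts_from_interval(double interval_s)
{
    return watts_from_window(1, interval_s);
}
```

For example, ten pulses within a five second window work out to about 3600W, and one pulse every 1.8 seconds to about 1000W.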

As usual, the code can be found in my github account: