Ben’s Updates: Honors Shark Wrangling

Hello, everyone!

I’ve decided to make my work on the Sharkduino project – from now to December 2017, anyway – my Computer Science honors thesis under the sage tutelage of Dr. Gang Zhou, an associate professor at the College of William and Mary. Dr. Zhou has experience in embedded programming, the Internet of Things, and body sensor networks, so he’s a natural fit for our work. I’m glad to have him on the team.

Winter break was long and boring, but the fun parts of it were spent working on William’s additions to Arduino v2, specifically the DS1339 real-time clock and FXAS21002C gyroscope. The former had a comprehensive but somewhat bulky library already available, so integration was fairly simple – strip out the old code and throw the new code in. The latter was more complicated.

We’re mainly using the FXAS to reduce power consumption, as the gyroscope uses more power than any other sensor. However, the FXAS also has an internal FIFO buffer to store data, just like the previous L3GD20 gyroscope. I’d already implemented the L3GD20’s FIFO functionality into an existing Adafruit library, so I felt confident that I could do it again. The problem was that there was no official Arduino code for the FXAS; the only library we could find turned out to have significant errors. Ultimately, I ended up writing my own library, which was a fun and elucidating experience. I plan to publish it on GitHub once I test it more.

I’ve got a few minor things on the to-do list, but my immediate future can go down one of two paths: I can work on a new accelerometer with a FIFO buffer similar to the gyroscope’s, or work on selectively disabling the gyroscope during periods of low activity. Either way, I’ve got an exciting semester ahead of me.


Ben’s Updates – 2 Weeks to Touchdown

Hey, guys. If you’ve been reading William’s blog, you know that we were at the VIMS Eastern Shore Lab last week, tagging sharks and generally having a great time. We’ve already got about 15 days of accelerometer-only data from some other tags as well as one (soon to be five) day(s) of data from our own, so we must now analyze the data and see what we can get out of it. However, since I skipped last week’s post, I ought to bring everybody up to speed on what I’ve been doing – it has indeed been a real doozy.

Burst Gyros

After finally getting through the burst-gyro reads debacle, I pulled out the oscilloscope to check for improvements in power consumption – and there were indeed improvements. I was now pulling data from the gyroscope at a much slower rate, so power consumption from that was reduced by a factor of about 32. Unfortunately, this activity probably took up only about 10% of our power use, and as Amdahl’s law states, a big efficiency gain in a small part of the system produces only a small gain overall.

The bigger reduction comes from the fact that SD writes are now 1.5 times less frequent. The SD takes up about 75% of our current draw, so this is a decent improvement. However, as I’ll later show, this improvement was quickly overshadowed.
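To put numbers on that, here is a minimal sketch of Amdahl’s law applied to power draw. The function name is made up for illustration, and it assumes that a component’s power use scales directly with how often it is active, which is only roughly true.

```c
#include <assert.h>

/* Amdahl's law: overall improvement when `fraction` of the total
 * (here, of the power draw) is improved by `factor`.
 * Assumption: the rest of the system is unchanged. */
double amdahl(double fraction, double factor)
{
    assert(fraction >= 0.0 && fraction <= 1.0 && factor > 0.0);
    return 1.0 / ((1.0 - fraction) + fraction / factor);
}
```

With the rough figures above, amdahl(0.10, 32.0) comes out to about 1.11 for the gyroscope reads, while amdahl(0.75, 1.5) comes out to about 1.33 for the SD writes. The smaller factor wins because it applies to the bigger slice.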


I also decided to give the accelerometer a little TLC through some bit-packing. Bit-packing is a way of using some bit-level manipulations to efficiently “pack” data. Programmers think of the “memory” for a program as a giant array of “words”, with each word having its own address. For the Arduino, each word is just a byte (we call this system “byte-addressable”), and each byte has 8 bits.

For example, we could store a char (1 byte) at address 0x100. We store larger values across multiple bytes, so a short goes from 0x100 to 0x101. However, we can’t naturally have values that start or end in the middle of a byte – the idea of storing a char between 0x100.5 and 0x101.5 doesn’t make any sense. A 12-bit integer will therefore take up the same amount of space as a 16-bit one, and the unused space is known as “padding”.

However, with some clever bit-level manipulation, we can in fact store values in between bytes. This is the original structure for storing accelerometer data:

struct accel_data {
    short x: 12;
    short y: 12;
    short z: 12;
};

Like I said, each 12-bit integer takes up 16 bits of space, so there are 12 bits of padding. In other words, 25% of our buffer was just wasted space. To get around this, I split each x, y, and z value into its lower 8 bits (“least significant byte”) and upper 4 bits (“most significant nibble”, nibble meaning 4 bits). The result is a “pack” that combines two accelerometer reads with no padding:

struct accel_pack {
    // These are the lower 8 bits of each data point.
    // This part of the struct takes up 6 bytes.
    struct {
        byte x1, y1, z1;
        byte x2, y2, z2;
    } lsb;
    // These are the higher 4 bits of each data point.
    // The compiler will pack this into 3 bytes for me.
    struct {
        byte x1: 4; byte x2: 4;
        byte y1: 4; byte y2: 4;
        byte z1: 4; byte z2: 4;
    } msb;
};

Astute readers might be wondering how the fields in msb can be laid out with no padding. In truth, the C compiler is adding some behind-the-scenes code. x1 and x2, for example, are stored in the same byte. If I want to read from x1, it has to isolate x1 and copy it into a different byte before I can play with it. If I write to x1, it has to avoid overwriting x2 in the process.
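For the curious, the masking and shifting the compiler does can also be written by hand. This is a minimal sketch, not the tag’s actual code: the type and function names are hypothetical, and it packs just one pair of 12-bit signed samples into three bytes rather than the full x/y/z pack.

```c
#include <assert.h>
#include <stdint.h>

/* Two 12-bit signed samples in three bytes: one low byte each,
 * plus one shared byte holding both high nibbles. */
typedef struct {
    uint8_t lsb1, lsb2;
    uint8_t msb; /* low nibble: sample 1, high nibble: sample 2 */
} packed_pair;

packed_pair pack_pair(int16_t a, int16_t b)
{
    assert(a >= -2048 && a <= 2047 && b >= -2048 && b <= 2047);
    uint16_t ua = (uint16_t)a & 0x0FFF; /* keep the low 12 bits */
    uint16_t ub = (uint16_t)b & 0x0FFF;
    packed_pair p;
    p.lsb1 = (uint8_t)(ua & 0xFF);
    p.lsb2 = (uint8_t)(ub & 0xFF);
    p.msb = (uint8_t)((ua >> 8) | ((ub >> 8) << 4));
    return p;
}

/* Turn a raw 12-bit value back into a signed 16-bit integer. */
int16_t sign_extend12(uint16_t v)
{
    return (v & 0x0800) ? (int16_t)((int)v - 0x1000) : (int16_t)v;
}

int16_t unpack_first(packed_pair p)
{
    /* Mask off the other sample's nibble before using ours. */
    return sign_extend12((uint16_t)(p.lsb1 | ((p.msb & 0x0F) << 8)));
}

int16_t unpack_second(packed_pair p)
{
    return sign_extend12((uint16_t)(p.lsb2 | ((p.msb >> 4) << 8)));
}
```

Writing it out by hand has one advantage over bit-fields: the byte layout is explicit, whereas the layout of bit-fields inside a struct is left up to the compiler.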

So, with that massive text dump out of the way, what are the actual gains from this? To be completely honest… nothing, at least not yet. The tag currently bit-packs perfectly, but in order to take advantage of that I need to precisely set a few constants. Since one of my future goals is to organize the various constants floating around in my code, I’m going to put that off for just a little bit. Hopefully, I can get back to you guys on that later.

Binary SD Writes

I’m way over limit right now, and I haven’t even gotten to Shark Week yet. But this next part is pretty dang awesome, for two reasons. First, I made a new file format, and second, I may have dropped the SD card’s power consumption by a factor of eight.

Up until now, we’ve been writing to the SD card in plain-text. This was very convenient, and it’s standard for many tags, but it’s wasteful. While the SD card is open, every CPU cycle counts, and the Arduino needs to convert every single sensor value into a string, write it to the SD card, do a lot of looping and branching… it’s tedious.

The alternative is to just write the raw data in binary and parse it into plaintext from our computers. I had tried this many months back, but it seemed to fail – it actually took longer to write to the SD card than the plaintext method. I still don’t know why that happened, but I tried it again on a hunch and found something amazing.

The Arduino originally reported that it took around 400 ms to write to the SD card. Now, it took 50 ms. I checked and re-checked the data, and it all came out right. I can’t explain how satisfying that is, but it’s pretty satisfying.
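Part of the difference is simply the number of bytes involved. The sketch below compares one accelerometer sample written as a line of text with the same sample written raw. The comma-separated line format and the function names are assumptions for illustration, not the tag’s real logging code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bytes needed to log one sample as text, e.g. "-1234,567,-2048\n".
 * The formatting step is also where the extra CPU time goes. */
size_t text_size(int16_t x, int16_t y, int16_t z)
{
    char line[32];
    int n = snprintf(line, sizeof line, "%d,%d,%d\n", x, y, z);
    assert(n > 0 && (size_t)n < sizeof line);
    return (size_t)n;
}

/* Bytes needed to log the same sample as three raw 16-bit values. */
size_t binary_size(void)
{
    return 3 * sizeof(int16_t);
}
```

For the sample (-1234, 567, -2048), the text line is 16 bytes against 6 bytes of binary, and the binary version needs no conversion at all.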

I ended up writing my own file format for the tag, called .SRK for obvious reasons, and a parser in C++ that converted the data into a CSV. That took the better part of a week, thanks to a myriad of small errors, but the end result is pretty good, especially the parser. Since I’m potentially working with Windows users who might not have C++ compilers, I might rewrite the parser later in Python or figure out cross-compilation, but it was a fun exercise and will do for now.

Overall – and this is a bit sketchy – Amdahl’s law and my oscilloscope tell me that the combined optimizations reduced the tag’s power consumption by a factor of 5. We haven’t had time to measure the tag’s new battery life, but it was at least 40 hours on a 450 mAh battery before these changes. We can thus estimate that the tag will now last for over 200 hours, which was what we were aiming for at the start of the project.
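As a rough check on that factor, the arithmetic can be reconstructed from the figures quoted earlier. This assumes the gyroscope reads were about 10% of the power draw, the SD card about 75%, and that the two SD gains (1.5 times fewer writes, each about 8 times shorter) simply multiply. All three are approximations.

```latex
S = \frac{1}{\underbrace{0.15}_{\text{everything else}}
      + \underbrace{\frac{0.10}{32}}_{\text{gyro reads}}
      + \underbrace{\frac{0.75}{1.5 \times 8}}_{\text{SD writes}}}
  = \frac{1}{0.15 + 0.003125 + 0.0625}
  \approx 4.6
```

That lands in the same ballpark as the factor of 5 above, and 40 hours times 4.6 is roughly 185 hours, so "at least 40 hours" before the changes puts 200 hours within reach.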

Shark Week

After all that, I finally got to what I’ll call “Shark Week” – the time William and I spent as interns per se at the Wachapreague Eastern Shore Lab. A lot of what went down is hardware-related, so you should check William’s blog for that, but I personally had a lot of fun. We managed to get our tags in the water, didn’t hurt any sharks, and have quite a bit of data to analyze. The question now becomes how to analyze it.

It’s inconvenient to constantly record sharks. Even if you stick a GoPro in the water – sharks think electronics are food due to electrical impulses, so you need to be careful – you need to scrub through days of footage later. In lieu of that, we decided to get a bit of important video data and a feeding schedule for reference. We can then see to what extent (and how) our machine learning algorithms can reliably distinguish between feeding and non-feeding patterns. We can also do a similar thing with day and night or other factors. This is what I plan to spend my last two weeks doing. I’ll tell you how it all goes.

See you then.

Ben’s Updates – Week 3

This is a cross-post from this site.

Yup, shark wrangling is mostly just data analysis and, for now, signal filtering. I’m not the best physicist, but right now, William’s working on the PVC enclosure we need to deal with the briny depths, so I’m on math duty. The upshot of this is that I’m working on two separate things right now, one (seemingly) simple and one (seemingly) complex.

My first goal is to use the accelerometer readouts to estimate the pitch and roll of our tag. When the tag’s not moving, this is extremely simple; we use a bit of trig to find the angle between the current gravity vector and our theoretical reference frame of [0, 0, -1] g’s (tag face-down). The problem is that A) this is really susceptible to system noise (like sea currents or, you know, a moving shark) and B) there are a lot of edge cases to consider that my feeble CS mind cannot easily comprehend, like gimbal lock.
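For the stationary case, the "bit of trig" is the usual angle between two vectors. With a measured acceleration a = (a_x, a_y, a_z) in g’s and the reference r = (0, 0, -1), the total tilt is:

```latex
\theta
  = \arccos\!\left(\frac{\mathbf{a}\cdot\mathbf{r}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{r}\rVert}\right)
  = \arccos\!\left(\frac{-a_z}{\sqrt{a_x^2 + a_y^2 + a_z^2}}\right)
```

As a worked example, a reading of (0, 0.5, -0.866) g has a magnitude of about 1, so the tilt is arccos(0.866), or about 30 degrees. Note that this gives a single tilt angle; splitting it into separate pitch and roll values depends on an axis convention that isn’t pinned down here.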

Alongside the accelerometer, we’ve got a gyroscope, and unlike our janky accelerometry trig, we can just integrate the gyroscope’s angular velocity readings to get angle. The only reason we need an accelerometer is to correct for gyroscope drift – since we’re integrating, noise is cumulative, and gyroscope readings can be unreliable in as little as 10 seconds. To correct this, we’re going to use a Kalman filter, which combines both readings in a manner that removes drift and other noise as well. It’s pretty cool, and while it seems horrifying at first, there are a lot of sites out there that offer a gentle (or simplified) introduction. If you still don’t want to use it, you can use a complementary filter – it’s one line of code and takes up no extra RAM if you’re working with an embedded system.
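Here is roughly what that one line looks like, plus two small helpers that show why it’s needed. This is a sketch under assumptions: the function names are made up, angles are in degrees, and the blend factor of 0.98 used in the example is a common starting point rather than a tuned value.

```c
#include <assert.h>

/* One complementary filter step: trust the integrated gyro in the
 * short term and pull slowly toward the accelerometer angle. */
double complementary_step(double angle, double gyro_rate,
                          double accel_angle, double dt, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0 && dt > 0.0);
    return alpha * (angle + gyro_rate * dt) + (1.0 - alpha) * accel_angle;
}

/* Gyro integration alone: any constant bias grows without bound. */
double integrate_only(double gyro_rate, double dt, int steps)
{
    double angle = 0.0;
    for (int i = 0; i < steps; i++)
        angle += gyro_rate * dt;
    return angle;
}

/* The same readings run through the filter, starting from zero. */
double run_filter(double gyro_rate, double accel_angle,
                  double dt, double alpha, int steps)
{
    double angle = 0.0;
    for (int i = 0; i < steps; i++)
        angle = complementary_step(angle, gyro_rate, accel_angle, dt, alpha);
    return angle;
}
```

For a motionless tag whose gyroscope reads a constant 0.5 degrees per second of bias, integrate_only(0.5, 0.01, 1000) drifts to about 5 degrees after ten seconds, while run_filter(0.5, 0.0, 0.01, 0.98, 1000) settles at roughly a quarter of a degree. The only state it keeps is the angle itself.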

To help with both of these problems, and because I was bored of friggin’ MATLAB scripts, I wrote a Python script that produces simulated tag output! It’s actually fairly useful to see how and where my work is going horribly wrong. If anyone wants to have a tag simulator, I could spruce it up a little and send it your way.

Be seeing you.

Ben’s Updates – It’s All Downhill From Here

So I can’t even begin to figure this out, but despite writing less data and (supposedly) using less CPU, the binary format appears to use more power than the plaintext one. On the plus side, I didn’t get that far into it, but on the far more significant minus side, I’m probably not doing any more work on the tag in the near future. All the sensors have been accounted for in code, and any further optimization is low-priority.

Now, on one hand, that’s great. The tag is “done”, in some sense or another. On the other hand, the work to come after that is in A) getting rid of gravity and rotating the accelerometer data into the shark’s inertial reference frame, and B) machine learning. B) could be interesting, but right now I’m saddled with A), and since I don’t know that much physics, it comes down to copying other people’s solutions and hoping that they were right.

In any case, as I understand it, my job is to guess the pitch and roll of the shark… wait, hold on a second.

I had to look it up too.

Anyway, I have to guess the pitch and roll of the shark from the accelerometer (with a bit of trig) and the gyroscope (with a bit of integration). The reason why I do both is because each sensor has a certain amount of error that, no matter how small, can accumulate to massive drift very quickly. I can use a Kalman filter, which takes the estimated drift and predicted pitch/roll into account, to get a more accurate measurement than either would alone. After that, I can figure out how the shark was oriented at a particular time and take out the gravity.
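To make that concrete, below is a sketch of a widely used two-state Kalman filter for a single axis, which tracks both the angle and the gyroscope’s bias. It is not the project’s code: the names are hypothetical, the noise values in kalman_init are illustrative guesses that would need tuning, and a real tag would run one filter each for pitch and roll.

```c
#include <assert.h>

typedef struct {
    double angle;   /* estimated angle, degrees */
    double bias;    /* estimated gyro bias, degrees per second */
    double P[2][2]; /* error covariance of (angle, bias) */
    double q_angle, q_bias, r_measure; /* noise settings (assumed) */
} kalman_t;

void kalman_init(kalman_t *k)
{
    k->angle = 0.0;
    k->bias = 0.0;
    k->P[0][0] = k->P[0][1] = k->P[1][0] = k->P[1][1] = 0.0;
    k->q_angle = 0.001;
    k->q_bias = 0.003;
    k->r_measure = 0.03;
}

/* One step: predict from the gyro, then correct with the
 * accelerometer-derived angle. Returns the new angle estimate. */
double kalman_update(kalman_t *k, double accel_angle,
                     double gyro_rate, double dt)
{
    assert(dt > 0.0);

    /* Predict: integrate the bias-corrected rate, grow uncertainty. */
    k->angle += dt * (gyro_rate - k->bias);
    k->P[0][0] += dt * (dt * k->P[1][1] - k->P[0][1] - k->P[1][0] + k->q_angle);
    k->P[0][1] -= dt * k->P[1][1];
    k->P[1][0] -= dt * k->P[1][1];
    k->P[1][1] += k->q_bias * dt;

    /* Correct: weigh the accelerometer angle by the Kalman gain. */
    double s = k->P[0][0] + k->r_measure;
    double k0 = k->P[0][0] / s;
    double k1 = k->P[1][0] / s;
    double y = accel_angle - k->angle;
    k->angle += k0 * y;
    k->bias += k1 * y;

    double p00 = k->P[0][0], p01 = k->P[0][1];
    k->P[0][0] -= k0 * p00;
    k->P[0][1] -= k0 * p01;
    k->P[1][0] -= k1 * p00;
    k->P[1][1] -= k1 * p01;
    return k->angle;
}

/* Feed the filter a motionless tag whose gyro has a constant bias. */
void run_still_tag(kalman_t *k, double gyro_bias, double dt, int steps)
{
    for (int i = 0; i < steps; i++)
        kalman_update(k, 0.0, gyro_bias, dt);
}
```

In the motionless case, the bias estimate should settle near the gyroscope’s true bias while the angle stays near zero, which is the drift correction described above.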

Hopefully. I’m probably missing something.


Ben’s Updates – Headway

First, an announcement:



These specific sharks, not sharks in general.

That’s right – Dr. Weng has caught a few sandbar sharks for us to ethically experiment on in the future. They’ll come in handy about a month from now.

Second, with the advent of the summer, I’ve started to work on the Sharkduino project full time, eight hours a day, for real cash money. It’s been really fun so far – I’ve had a lot of time to work on things that I wouldn’t be able to get to otherwise.

So what have I been doing?

The tag we’ve been using has stayed largely stable in terms of features (although we’ll be adding a few new sensors soon), so I’ve mainly been re-organizing things to minimize memory use and maximize buffer size. My first task was to ditch my stupid (and honestly over-engineered) class architecture and just use C-style separate compilation. I was also able to move my debugging strings into PROGMEM to save a few bytes here and there.

My next step after this, though, is to move our datalogging from a (slow, clunky) text format to a (fast, efficient) binary file. That will be a lot harder; I need to make a new file format to hold the data and a script to convert it back into plaintext.

I’ve also started looking at ways to analyze the data we get – I essentially want to see what a shark is doing (eating, sleeping, mating) and when. To do that, we need machine learning. Basically, we make a classifier out of a set of training data that takes features of our data (mean, variance, energy) and guesses what the shark is doing. My current patchwork methodology is coming from a study on humans, which you can find here.
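As a sketch of what those features look like in code, the function below computes all three over one window of samples from a single axis. The names are made up, and "energy" is taken here to mean the average squared sample, which is one common definition and not necessarily the one the study uses.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double mean;
    double variance; /* population variance */
    double energy;   /* mean of squared samples (assumed definition) */
} features;

/* Summarize one window of sensor samples into classifier features. */
features window_features(const double *x, size_t n)
{
    assert(x != NULL && n > 0);
    double sum = 0.0, sum_sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += x[i];
        sum_sq += x[i] * x[i];
    }
    features f;
    f.mean = sum / (double)n;
    f.energy = sum_sq / (double)n;
    /* One-pass formula: fine for a sketch, less accurate for
     * windows with a large mean and a tiny spread. */
    f.variance = f.energy - f.mean * f.mean;
    return f;
}
```

For the window {1, 2, 3, 4}, this gives a mean of 2.5, a variance of 1.25, and an energy of 7.5. Each window would become one row of training data, labeled with whatever the shark was doing at the time.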

I’ll try to keep my posts short and frequent this summer. See you guys next week!