Smart Data, Part III
nutrition · Data Analysis


In the first part of this series (Small data - Big opportunities) I rebelled against big data and established that you need more than just data: the knowledge of what to track, how to use it and how to integrate feedback, and of course an audience that generates the data. The second part (Small data - Tracking food data) demonstrated these principles (except the audience principle) using the example of tracking calories for weight control.

And now I conclude with the missing parts: the audience and an overview of my implementation.


As already said, my reason for tracking calories was to feel better about myself and to be healthier by losing weight.
At the beginning I used fddb.info, an online tracking service. But I ended up implementing my own application because of its lack of automation and data storage capabilities, and because of the following further requirements:

  • Saving my data for as long as I want in order to analyze it, and to always remember what I have achieved
  • Placing recurring items automatically, e.g. I eat a cottage cheese mix every day before going to bed
  • Food planning capabilities for the upcoming week
  • Placement of daily items based on a nutrient breakdown, e.g. my breakfasts usually consist of a few basic items with different nutrient profiles, which are placed depending on the meals of the day
  • Shopping planning based on the needed food for the week
  • And finally, I am very curious about any new findings :)

How much data can one individual generate? My estimate for a week: 7 days, generally 4 meals per day, and on average 3.5 parts per meal, with a minimum of 2 parts (noodles and chicken) and up to 5 parts (for example rice, chicken, vegetables, sauce and some fruit as dessert) => 7 x 4 x 3.5 = 98 parts per week, or roughly 5000 parts per year. It’s by no means big data, but definitely enough for smarter decisions compared to no data.


How does my application work? The program is implemented in Python. It is command-line based (except for one part with a Tk GUI), uses text files as data input and SQLite as its database. It is basically a bunch of scripts written to get fast results, which are now in dire need of a rewrite.
Let’s begin with calculating the required amount of calories.

Calculating the bodyfat percentage and to-consume calories

To know my calorie target, I first need to know how much fat I carry around, which gives me my lean body mass (LBM). Consumer methods are unreliable, so I take the average of three methods:

  • Scale with bodyfat “measure”: delivers a percentage
  • Navy method: different for men and women; all measures in cm:
    • men: \(86.01 \cdot \log_{10}(\text{abdomen} - \text{neck}) - 70.041 \cdot \log_{10}(\text{height}) + 30.295\)
    • women: \(163.205 \cdot \log_{10}(\text{waist} + \text{hip} - \text{neck}) - 97.684 \cdot \log_{10}(\text{height}) - 104.912\)
  • Caliper: measurement of the skinfold thickness at different points on the body. Again, it differs for men and women, and there are different methods (number of points to measure). I measure three points, take the average and look up the result in a table.
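
As a rough illustration, the two Navy formulas from the list above could be written in Python like this. The function names, parameter names and sample measurements are assumptions made for this sketch, not taken from the actual scripts.

```python
import math


def navy_bodyfat_men(abdomen_cm, neck_cm, height_cm):
    """Navy-method body fat percentage for men (all lengths in cm)."""
    return (86.01 * math.log10(abdomen_cm - neck_cm)
            - 70.041 * math.log10(height_cm) + 30.295)


def navy_bodyfat_women(waist_cm, hip_cm, neck_cm, height_cm):
    """Navy-method body fat percentage for women (all lengths in cm)."""
    return (163.205 * math.log10(waist_cm + hip_cm - neck_cm)
            - 97.684 * math.log10(height_cm) - 104.912)


# Hypothetical measurements, for illustration only.
print(round(navy_bodyfat_men(abdomen_cm=94, neck_cm=39.5, height_cm=180), 1))
print(round(navy_bodyfat_women(waist_cm=80, hip_cm=100, neck_cm=34, height_cm=165), 1))
```

Both functions return a percentage, so the result can be averaged directly with the scale and caliper values.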

Measurement entries look like this:

    weight: [84.8]
    bodyfat: {"scale": 30.8, "navy": [39.5, 94], "caliper": [14, 16, 16]}
    tdee_adjust: -22

Based on the LBM we can get the basal metabolic rate (BMR, stay-alive calories); together with an activity level we get the total daily energy expenditure (TDEE, stay-alive-and-do-stuff calories). Consuming TDEE calories should maintain the weight. But this value often needs adjusting, either to lose or gain weight, or simply because the calculated value doesn’t match the TDEE that actually maintains the weight.

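\[
\begin{aligned}
\text{bodyfat percentage} &= (\text{scale} + \text{navy} + \text{caliper}) / 3 \\
\text{LBM} &= \text{weight} \cdot (100 - \text{bodyfat percentage}) / 100 \\
\text{BMR} &= 370 + 21.6 \cdot \text{LBM} \\
\text{TDEE} &= \text{BMR} \cdot \text{activity level}
\end{aligned}
\]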

Finally the calorie target we should consume on average per day (ATDEE) can be calculated:

\[
\text{ATDEE} = \text{TDEE} \cdot (100 + \text{adjustment}) / 100
\]
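
The chain from measurements to calorie target can be sketched in a few lines of Python. This is only an illustration of the formulas above: the function name, the activity factor of 1.5 and all input values are hypothetical.

```python
def calorie_target(weight_kg, bodyfat_estimates, activity_level, adjustment_pct):
    """Return (bodyfat %, LBM, BMR, TDEE, ATDEE) following the formulas above."""
    bodyfat_pct = sum(bodyfat_estimates) / len(bodyfat_estimates)
    lbm = weight_kg * (100 - bodyfat_pct) / 100
    bmr = 370 + 21.6 * lbm
    tdee = bmr * activity_level
    atdee = tdee * (100 + adjustment_pct) / 100
    return bodyfat_pct, lbm, bmr, tdee, atdee


# Hypothetical inputs: three bodyfat estimates (scale, navy, caliper),
# an assumed activity factor of 1.5 and a 20 % deficit.
bodyfat, lbm, bmr, tdee, atdee = calorie_target(
    weight_kg=80.0,
    bodyfat_estimates=[30.0, 25.0, 20.0],
    activity_level=1.5,
    adjustment_pct=-20,
)
print(f"LBM {lbm:.1f} kg, BMR {bmr:.0f} kcal, TDEE {tdee:.0f} kcal, target {atdee:.0f} kcal")
```

A negative adjustment lowers the target for losing weight, a positive one raises it for gaining.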

Calories are then distributed to the macronutrients that should be consumed. My personal settings are 2-3 g of protein per kg LBM, 1 g of fat per kg LBM, and the rest as carbs.
Example: 2083 kcal would result in something like 200 g protein, 80 g fat and 140 g carbs (200 x 4 + 80 x 9 + 140 x 4 = 2080 kcal).
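
A minimal sketch of this distribution, assuming 4 kcal per gram of protein and carbs and 9 kcal per gram of fat. The function name, the default of 2.5 g protein per kg LBM and the sample inputs are illustrative assumptions.

```python
def macro_targets(kcal_target, lbm_kg, protein_g_per_kg=2.5, fat_g_per_kg=1.0):
    """Split a calorie target into grams of protein, fat and carbs."""
    protein_g = protein_g_per_kg * lbm_kg
    fat_g = fat_g_per_kg * lbm_kg
    # Protein and carbs: 4 kcal per gram; fat: 9 kcal per gram.
    remaining_kcal = kcal_target - protein_g * 4 - fat_g * 9
    # Whatever is left after protein and fat goes to carbs (never negative).
    carbs_g = max(remaining_kcal, 0) / 4
    return protein_g, fat_g, carbs_g


# Hypothetical target and LBM, for illustration only.
protein, fat, carbs = macro_targets(kcal_target=2000, lbm_kg=60)
print(f"{protein:.0f} g protein, {fat:.0f} g fat, {carbs:.0f} g carbs")
```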

Calculating consumed calories

Knowing what you need to consume is of little use if you don’t know what you have consumed.

The data about what has been consumed is also stored in the database and is entered through text files on a weekly basis. Daily entries look similar to this:

	- breakfast:
	- dinner:
			servings: 2
	- lunch:
			kcals: 320
	- night:
			gr: 25

You can specify the amount in grams (gr), as specific kcals, or in servings. Every consumed item needs an equivalent in the foods table, which holds the macros per 100 g and the serving_size for every item, for instance:

	f: 0.1
	p: 0.3
	k: 11.4
	serving_size: 150

Together, these make it easy to calculate the total calories and macros for the day. Every servings and kcals entry gets converted to gr; then just loop over every item, get its macros (f, p, k: fat, protein, carbs) by multiplying gr with the macros per 100 g, and sum everything up:

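total_f = total_p = total_k = total_kcals = 0.0
for item in consumed_items:
	# macros are stored per 100 g, so scale them by the consumed grams
	f, p, k = (item.gr * m / 100 for m in item.macros_per_100_g)
	total_f, total_p, total_k = total_f + f, total_p + p, total_k + k
	# fat has 9 kcal per gram, protein and carbs have 4 kcal each
	total_kcals += f * 9 + p * 4 + k * 4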

... or use a matrix multiplication instead of the loop.
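
To make the conversion to grams and the matrix variant concrete, here is a self-contained sketch in plain Python. The food names, their values, the dictionary layout and the rule that an entry without an amount counts as one serving are all assumptions for illustration; the real tables live in the database.

```python
# Hypothetical foods table: macros per 100 g plus a default serving size in g.
FOODS = {
    "apple": {"f": 0.1, "p": 0.3, "k": 11.4, "serving_size": 150},
    "chicken": {"f": 2.0, "p": 23.0, "k": 0.0, "serving_size": 200},
}

KCAL_PER_GRAM = (9, 4, 4)  # fat, protein, carbs


def kcal_per_100g(food):
    """Calories of 100 g of a food, derived from its macros."""
    return sum(food[m] * kcal for m, kcal in zip("fpk", KCAL_PER_GRAM))


def to_grams(entry, food):
    """Convert a consumed entry given in gr, kcals or servings to grams."""
    if "gr" in entry:
        return entry["gr"]
    if "kcals" in entry:
        return entry["kcals"] / kcal_per_100g(food) * 100
    # Assumption: an entry without an amount counts as one serving.
    return entry.get("servings", 1) * food["serving_size"]


def day_totals(entries):
    """Grams vector (1 x n) times macro matrix (n x 3), divided by 100."""
    grams = [to_grams(e, FOODS[e["item"]]) for e in entries]
    matrix = [[FOODS[e["item"]][m] for m in "fpk"] for e in entries]
    f, p, k = (sum(g * row[col] for g, row in zip(grams, matrix)) / 100
               for col in range(3))
    return f, p, k, f * 9 + p * 4 + k * 4


day = [
    {"item": "apple"},                   # one serving by default
    {"item": "chicken", "servings": 2},
    {"item": "apple", "gr": 25},
]
print(day_totals(day))
```

With a numerical library the body of `day_totals` collapses to a single vector-matrix product, which is what makes the matrix formulation attractive for a whole week.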

A calculated week looks like this:


Yellow numbers mean that the amount is too low relative to the target; red, in contrast, means it is too high.

Week planning

As already mentioned, a week can easily contain about 100 items. Always typing or copy-pasting them would defeat the purpose of easier tracking and automation. Therefore, the week planning is three-fold:

  • a GUI to place the lunch and dinner meals,
  • general repeated items that are always included (e.g. an apple in the morning),
  • items that are placed based on some constraint (e.g. place the highest-protein item on the day with the lowest protein).
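
The constraint from the last bullet can be illustrated with a small greedy placement. This is a sketch under assumptions, not the actual planner: the function name, the one-item-per-day rule and the sample data are hypothetical.

```python
def place_by_protein(day_protein, items):
    """Greedy placement: the highest-protein item goes to the lowest-protein day.

    day_protein maps day -> grams of protein already planned,
    items maps item name -> grams of protein per serving.
    """
    totals = dict(day_protein)
    plan = {}
    for name, protein in sorted(items.items(), key=lambda kv: kv[1], reverse=True):
        # Only days without a placed item are candidates (one item per day).
        free_days = [d for d in totals if d not in plan]
        if not free_days:
            break
        day = min(free_days, key=totals.get)
        plan[day] = name
        totals[day] += protein
    return plan


# Hypothetical week and breakfast items.
week = {"mon": 120, "tue": 90, "wed": 150}
breakfasts = {"oatmeal": 12, "omelette": 30, "yoghurt": 20}
print(place_by_protein(week, breakfasts))
```

The same pattern works for any other nutrient by swapping the values in the two dictionaries.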


The result is a file containing the food items for the week, which can be tweaked manually if desired.



I believe that, deployed online and used by many, my tool might help people reach their respective weight and health goals.

Besides the obvious tracking parts and the big database of consumable items that a service must have, there are many possible features derived from the data that I have yet to see in a service:

  • proposals for better food alternatives based on other people’s similar tastes
  • personalized TDEE calculator which uses the actual consumed items and weight fluctuations
  • much better visualisations of possible future progress based on past experiences and other similar persons
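
As an idea of what the personalized TDEE calculator from the list above might do, here is a rough sketch. It assumes the common rule of thumb of about 7700 kcal per kg of body weight change; the function name and the sample numbers are hypothetical, and a real service would need to smooth out daily weight fluctuations first.

```python
KCAL_PER_KG = 7700  # rule-of-thumb energy equivalent of 1 kg of weight change


def estimate_tdee(daily_kcals, start_weight_kg, end_weight_kg):
    """Estimate TDEE from logged intake and the weight change over the same days."""
    days = len(daily_kcals)
    avg_intake = sum(daily_kcals) / days
    # Weight loss means expenditure exceeded intake by the energy drawn from reserves.
    daily_balance = (end_weight_kg - start_weight_kg) * KCAL_PER_KG / days
    return avg_intake - daily_balance


# Hypothetical log: 14 days at 2000 kcal with 1 kg lost.
print(round(estimate_tdee([2000] * 14, start_weight_kg=85.0, end_weight_kg=84.0)))
```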

But data and forecasts do nothing without compliance with the plan. A good community of like-minded people with shared goals is potentially more important than the data they generate.

However, if you would like to use my script locally, you can still rely on the general scientific concepts implemented as described. The only missing feature from my initial list is the grocery list, which wasn’t really needed in the end. Food is stocked for at least a week at a normal consumption rate and restocked when it would run low the following week.

As always, there is potential for improvement, for instance food items whose macros depend on the date. The cross-referencing between the consumed lists and the database depends on the names. Over time, the macros of some items change, because recipes change, other brands are used, or the default serving size changes. The workaround at the moment is to add new items with similar names and to state the serving size explicitly every time.
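
One possible way to handle date-dependent macros is to keep a dated history per item and look up the version valid on the day of consumption. The sketch below is an assumption about how this could look, not part of the current scripts; the item, the dates and the values are made up.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical history: per item, a list of (valid_from, macros) sorted by date.
FOOD_HISTORY = {
    "cottage cheese": [
        (date(2015, 1, 1), {"f": 4.0, "p": 12.0, "k": 3.0, "serving_size": 200}),
        (date(2016, 6, 1), {"f": 0.5, "p": 13.0, "k": 3.5, "serving_size": 250}),
    ],
}


def macros_on(item, day):
    """Return the macros that were valid for `item` on `day`."""
    versions = FOOD_HISTORY[item]
    # Index of the last version whose valid_from is not after `day`.
    idx = bisect_right([valid_from for valid_from, _ in versions], day) - 1
    if idx < 0:
        raise KeyError(f"no macros recorded for {item!r} before {day}")
    return versions[idx][1]


print(macros_on("cottage cheese", date(2016, 1, 15)))
```

With such a lookup, old weeks keep their original values and the item name can stay the same after a recipe or brand change.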

And maybe a GUI? I do think that this could reduce some friction (fewer spelling mistakes, better search, a live macro overview while planning). But this would need a well-thought-out GUI. As someone without experience in user interface design and implementation, this is pretty high on my vacation learning list. Maybe afterwards you’ll see another post from me about designing a GUI for food tracking.

Thanks for reading!

    Christian Seyda


    Software engineer working on mass data backend challenges. He studied computer science with a focus on data mining and cares about implementing applications that support healthy living.
