



First steps with AI & image recognition (using TensorFlow)

May 24, 2017

After reading the excellent O’Reilly book/essay collection What is Artificial Intelligence? by Mike Loukides and Ben Lorica, I got curious—and, finally, emboldened enough—to get my hands dirty with some n00b level AI and machine learning.

Pete Warden’s TensorFlow for Poets, part of Google’s codelabs, seemed like a logical starting point for me: My coding skills are very basic (and fairly dismal, tbh), and this is technically way beyond my skill level and comfort zone. But I feel confident that with a bit of tutorial-based hand-holding I can work my way through the necessary command-line action. From there, I can take it further.

For this first time I would stick to the exact instructions, line by line, mostly by copy & paste. It’s not the steepest learning curve that way, but it helps me walk through the process once before changing things up.

So, basic setup. I won’t include links here as they’re updated and maintained over on TensorFlow for Poets.

Get Docker up and running

Docker creates a Linux virtual machine that runs on my MacBook Pro. This created a first small bump, which after some reading up on Docker configuration turned out to have the oldest solution of all of tech: Relaunch the Docker app. Boom, works.
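For reference, the codelab boils this step down to a single command. The image tag and the shared folder path below are the ones the tutorial used at the time I followed it, so check the current instructions before copying:

```shell
# Start the TensorFlow development container and mount a local working
# directory into it (the codelab uses ~/tf_files for all working files)
docker run -it \
  -v $HOME/tf_files:/tf_files \
  gcr.io/tensorflow/tensorflow:latest-devel
```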

Get TensorFlow installed & download images for training set

As I continued the setup and installed TensorFlow as per the instructions, there was some downtime while the system downloaded and installed.

The tutorial suggests experimenting with the TensorFlow Playground, which is great, but I’d done that before. Instead, I decided to prepare my own set of images to train the Inception model on later. (After first following the tutorial exactly, including using their flower-based training image set.)

The training set consists of flower photos: five different types of flowers, with a few hundred photos each. Downloading them might take a while.
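Grabbing the flower set is a single download. This is roughly what the codelab had you run; the URL is the one the tutorial pointed to at the time and may have moved since:

```shell
# Download and unpack the flower training set (~200MB) into the
# working directory shared with the Docker container
cd $HOME/tf_files
curl -O http://download.tensorflow.org/example_images/flower_photos.tgz
tar xzf flower_photos.tgz
```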

First round of (re)training: Inception

The Inception network (v3, in this case) is a pre-trained TensorFlow network that we can re-train on our own images. It’s a tad over-powered for what we need here, according to the tutorial: “Inception is a huge image classification model with millions of parameters that can differentiate a large number of kinds of images. We’re only training the final layer of that network, so training will end in a reasonable amount of time.”
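In command form, the retraining step is a single script invocation. This is roughly what the codelab prescribed at the time (flag names and paths from that version of the tutorial; the retrain script has since moved around in the TensorFlow repo):

```shell
# Re-train only Inception's final layer on the flower photos.
# Bottlenecks (cached outputs of the penultimate layer) make repeat runs fast.
python tensorflow/examples/image_retraining/retrain.py \
  --bottleneck_dir=/tf_files/bottlenecks \
  --how_many_training_steps=500 \
  --model_dir=/tf_files/inception \
  --output_graph=/tf_files/retrained_graph.pb \
  --output_labels=/tf_files/retrained_labels.txt \
  --image_dir=/tf_files/flower_photos
```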

Inception downloads and goes to work. This is my cue: I go have lunch. It might take up to 30 minutes.

Half an hour later I’m back. I’ve had lunch, the Roomba has cleaned the kitchen, and the training is done.

Final test accuracy = 91.4% (N=162)

Train your own

Now it was time for me to take it to the next level: Put TensorFlow to work on my own image training set. I decided to go with a few members of the ThingsCon family. Iskander, Marcel, Max, Monique, Simon, and myself: 6 people total, with around 10-20 photos of each.

Now, these photos are mostly from conferences and other ThingsCon-related activities: During our summer camp and our Shenzhen trip. I added some personal ones, too.

A bunch are really horrible photos I included to really test the results: Not only is the training sample tiny, some images are really hard to discern even for human eyes. (There’s one that contains only a small part of Max’s face, for example—his gorgeous giant blond beard, but nothing else.) Lots are group pics. Many contain not just one but two or more of the people in this sample. These are hard images to train on.

Let’s see how it goes. I swap out the folders and files and run Inception again.
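Swapping in your own data just means mirroring the flower set’s layout: one folder per label, with the folder name becoming the category. My set looked something like this (the folder names are mine, not from the tutorial):

```
tf_files/people/
├── iskander/    # 10-20 photos each, folder name = label
├── marcel/
├── max/
├── monique/
├── peter/
└── simon/
```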


I had been warned about this. If a sample is too tiny, the network sometimes can’t handle it. We need more pics! I pull a few more from personal files, a few off of the web. Now it’s just over 20 images per “category”, aka person. Let’s try this again.


Still no luck. My working theory is that it’s too many photos with several of the yet-to-learn people in them, so the results are ambiguous. I add more pics I find online for every person.

I don’t want to make it too easy though, so I keep adding lots of pics in super low resolution. Think thumbnails. Am I helping? Probably not. But hey, onwards in the name of science!

Going back through the training set I realize just how many of these pics contain several of the yet-to-learn categories. Garbage in, garbage out. No wonder this isn’t working!

Even something as simple as this drives home the big point of machine learning: It’s all about your data set!

I do some manual cropping so that Inception has something to work with. A clean data set with unambiguous categories. And voilà, it runs.

Now, after these few tests, I snap two selfies, one with glasses and one without.

The output without glasses:

peter (score = 0.66335)
max (score = 0.14525)
monique (score = 0.07219)
simon (score = 0.05728)
marcel (score = 0.04428)
iskander (score = 0.01765)

The output with glasses:

peter (score = 0.75252)
max (score = 0.12352)
simon (score = 0.05971)
monique (score = 0.04001)
marcel (score = 0.01397)
iskander (score = 0.01027)

Interestingly, with glasses the algorithm recognizes me better even though I don’t wear any in the other images. Mysterious, but two out of two. I’ll take it!
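Those score lines are a softmax distribution over the six trained labels, which means they always sum to 1 and the top score is the prediction. A quick sanity check in plain Python (scores copied from the with-glasses run above):

```python
# Scores as printed by the classifier for the with-glasses selfie
scores = {
    "peter": 0.75252,
    "max": 0.12352,
    "simon": 0.05971,
    "monique": 0.04001,
    "marcel": 0.01397,
    "iskander": 0.01027,
}

# Softmax outputs sum to 1 across all labels
total = sum(scores.values())
assert abs(total - 1.0) < 1e-3

# The predicted label is simply the highest-scoring category
prediction = max(scores, key=scores.get)
print(prediction)  # → peter
```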

How about accuracy?

The tests above are the equivalent of a “hello world” for machine learning: The most basic, simple program you can try. They use the Inception network that’s been built and trained for weeks by Google, and just add one final layer on top, to great effect.

That said, it’s still interesting to look at the outcomes, and which factors influence the results. So let’s run the same analysis for 500 iterations compared to, say, 4,000!

The test image I use is a tricky one: It’s of Michelle, with a hand in front of her face.

500 iterations on a set of photos (this time, of family members):

michelle (score = 0.53117)

This isn’t the result of a confident algorithm!

So for comparison, let’s see the results for 4,000 iterations on the same training set:

michelle (score = 0.75689)

Now we’re talking!

At this point I’m quite happy with the results. For a first test, this delivers impressive results and, maybe even more importantly, is an incredible demonstration of the massive progress we’ve seen in the tooling for machine learning over the last few years.



Some thoughts on Google I/O and AI futures

May 18, 2017

Google’s developer conference Google I/O has been taking place these last couple of days, and oh boy have there been some gems in CEO Sundar Pichai’s announcements.

Just to get this out right at the top: Many analysts’ reactions to the announcements were a little meh: Too incremental, not enough consumer-facing product news, they seemed to find. I was surprised to read and hear that. For me, this year’s I/O announcements were huge. I haven’t been as excited about the future of Google in a long, long time. As far as I can tell, Google’s future looks a lot brighter today than it did yesterday.

Let’s dig into why.

Just as a quick overview and refresher, here are some of the key announcements (some links to write-ups included).

Let’s start with some announcements of a more general nature around market penetration and areas of focus:

  • There are now 2 billion active Android devices
  • Google Assistant comes to iOS (Wired)
  • Google has some new VR and AR products and collaborations in the making, both through their Daydream platform and stand-alone headsets

Impressive, but not super exciting; let’s move on to where the meat is: Artificial intelligence (AI), and more specifically machine learning (ML). Google announced a year ago that it would become an AI-first company, and it’s certainly making good on that promise:

  • Google Lens super-charges your phone camera with machine learning to recognize what you’re pointing the camera at and give you context and contextual actions (Wired)
  • Google turns Google Photos up to 11 through machine learning (via Lens), including not just facial recognition but also smart sharing.
  • Copy & paste gets much smarter through machine learning
  • Google Home can differentiate several users (through machine learning?)
  • Google Assistant’s SDK allows other companies and developers to include Assistant in their products (and not just in English, either)
  • Cloud TPU is the new hardware that Google launches for machine learning (Wired)
  • Google uses neural nets to design better neural nets

Here’s a 10min summary video from The Verge.

This is incredible. Every aspect of Google, both backend and frontend, is impacted by machine learning. Including their design of neural networks, which are improved by neural networks!

So what we see there are some incremental (if, in my book, significant) updates in consumer-facing products. This is mostly feature level:

  • Better copy & paste
  • Better facial recognition in photos (the error rate of their computer vision algorithms is now better than the human error rate, says ZDNet)
  • Smarter photo sharing (“share all photos of our daughter with my partner automatically”)
  • Live translation and contextual actions based on photos (like pointing camera at wifi router to read login credentials and log you into the router automatically).
  • Google Home can tell different users apart.

As features, these are nice-to-haves, not must-haves. However, they’re powered by AI. That changes everything. This is large-scale deployment of machine learning in consumer products. And not just consumer products.

Google’s AI-powered offerings also power other businesses now:

  • Assistant can be included in third party products, like Amazon’s Alexa. This increases reach, and also the datasets available to train the machine learning algorithms further.
  • The new Cloud TPU chips, combined with Google’s cloud-based machine learning framework around TensorFlow, mean that they’re now in the business of providing machine learning infrastructure: AI-as-a-Service (AIaaS).

It’s this last point that I find extremely exciting. Google just won the next 10 years.

The market for AI infrastructure—for AI-as-a-Service—is going to be mostly Google & Amazon (which already has a tremendous machine learning offering). The other players in that field (IBM, and maybe Microsoft at some point?) aren’t even in the same ballpark. Potentially there will be some newcomers; it doesn’t look like any of the other big tech companies will be huge players in that field.

As of today, Google sells AI infrastructure. This is a mode that we know from Amazon (where it has been working brilliantly), but so far hadn’t really known from Google.

There haven’t been many groundbreaking consumer-facing announcements at I/O. However, the future has never looked brighter for Google. Machine learning just became a lot more real and concrete. This is going to be exciting to watch.

At the same time, now’s the best time to think about societal implications, risks, and opportunities inherent in machine learning at scale: We’re on it. In my work as well as our community over at ThingsCon we’ve been tracking and discussing these issues in the context of Internet of Things for a long time. I see AI and machine learning as a logical continuation of this same debate. So in all my roles I’ll continue to advocate for a responsible, ethical, human-centric approach to emerging technologies.

Full disclosure: I’ve worked many times with different parts of Google, most recently with the Mountain View policy team. I do not, at the time of this writing, have a business relationship with Google. (I am, however, a heavy user of Google products.) Nothing I write here is based on any kind of information that isn’t publicly available.



Are we the last generation who experienced privacy as a default?

April 29, 2017

Attack of the VR headsets! Admittedly, this photo has little to do with the topic of this blog post. But I liked it, so there you go.

The internet, it seems, has turned against us. Once a utopian vision of free and plentiful information and knowledge for all to read. Of human connection. Instead, it has turned into a beast that reads us. Instead of human connection, all too often we are force-connected to things.

This began in the purely digital realm. It’s long since started to expand into the physical world, through all types of connected products and services that track us—notionally—for our own good. Our convenience. Our personalized service. On a bad day I’m tempted to say we’ve all allowed ourselves to be turned into things as part of the internet of things.


I was born in 1980. Just on the line that marks the outer limit of millennial. Am I part of that demographic? I can’t tell. It doesn’t matter. What matters is this:

Those of us born around that time might be the last generation to grow up experiencing privacy as a default.


When I grew up there was no reason to expect surveillance. Instead there was plenty of personal space: Near-total privacy, except for neighbors looking out of their windows. Also, the other side of that coin, near total boredom—certainly disconnection.

(Edit: This reflects growing up in the West, specifically in Germany, in the early 1980s—it’s not a shared universal experience, as Peter Rukavina rightfully points out in the comments. Thanks Peter!)

All of this within reason: It was a small town, the time was pre-internet, or at least pre-internet access for us. Nothing momentous had happened in that small town in decades if not centuries. There it was possible to have a reasonably good childhood: Healthy and reasonably wealthy, certainly by global standards. What in hindsight feels like endless summers. Nostalgia past, of course. It could be quite boring. Most of my friends lived a few towns away. The local library was tiny. The movie theater was a general-purpose event location that showed two movies per week, on Monday evenings. First one for children, then one for teenagers and adults. The old man who ticketed us also made popcorn, sometimes. I’m sure he also ran the projector.

Access to new information was slow, dripping. A magazine here and there. A copied VHS or audio tape. A CD purchased during next week’s trip to the city, if there was time to browse the shelves. The internet was becoming a thing, I kept reading about it. But until 1997, access was impossible for me. Somehow we didn’t get the dialup to work just right.

What worked was dialing into two local BBS systems. You could chat with one other person on one, with three on the other. FidoNet made it possible to have some discussions online, albeit ever so slowly.


When I grew up there was no expectation of surveillance. Ads weren’t targeted. They weren’t even online, but on TV and newspapers. They were there for you to read, every so often. Both were boring. But neither TVs nor newspapers tried to read you back.


A few years ago I visited Milford Sound. It’s a fjord on the southern end of New Zealand. It’s spectacular. It’s gorgeous. It rains almost year round.

If I remember a little info display at Milford Sound correctly, the man who first started settling there was a true loner. He didn’t mind living there by himself for decades. Nor, it seems, when the woman who was to become his wife joined. It’s not entirely clear how he liked that visitors started showing up.

Today it’s a grade A tourist destination, if not exactly for mass tourism. It looks and feels like the end of the world. In some ways, it is.

As we sought shelter from the pouring rain in the boat terminal’s cafeteria, our phones had no signal. Even there, though, you could connect to the internet.

Connectivity in Milford Sound comes at a steep price

Internet access in Milford Sound is expensive enough that it might just suffice to stay offline for a bit. It worked for us. But even there, though they might be disconnected, the temps who work there during tourist season probably don’t get real privacy. On a work & travel visa, you’re likely to live in a dorm situation.


The internet has started to track every move we make online. I’m not even talking about government or criminal surveillance. Ad tech, the online advertisements that track your every move, notices more about you than you ever notice about it. These are commercial trackers. On speed. They aren’t restricted to one website, either. If you’ve ever searched for a product online you’ll have noticed that it keeps following you around. Even the best ad blockers don’t guarantee protection.

Some companies have been called out because they use tracking cookies that can’t be deleted. That’s right: they track you even if you explicitly try to delete them. Have you given your consent? Legally, probably—it’s certainly hidden somewhere in your mobile ISP’s terms of service. But really, of course you haven’t agreed. Nobody in their right mind would.


Today we’re on the brink of taking this to the next level with connected devices. It started with smartphones. Depending on your mobile ISP, your phone might report back your location and they might sell your movement data to paying clients right now. Anonymized? Probably, a little. But these protections never really work.

Let’s not let it come to that: let’s be very deliberate about our next steps. The internet has brought tremendous good first, and then opened the door to tracking and surveillance abuse. IoT might go straight for the jugular without the benefits – if we make it so. If we allow that to happen.


The internet, it seems, has turned against us. But maybe it’s not too late just yet. Maybe we can turn the internet around, especially the internet of things. And make it work for all of us again. The key is to rein in tracking and surveillance. Let’s start with ad tech.



Monthnotes for April 2017

April 29, 2017

A bird’s-eye view of Shenzhen’s HuaqiangBei market road

Sitrep: I’m in Madrid, fighting jetlag with strong Americanos in a lovely little neighborhood café. The last real bed I got up from was in Shenzhen. In the 30 or so hours since then, I rode cabs, ferries, metros and planes; I strolled through Hong Kong and tried not to fall asleep in Abu Dhabi. But now I’m here, and using the temporary downtime of a rainy post-lunch Saturday afternoon in Madrid to write up these #monthnotes while everything’s still fresh on my mind.

April just flew by. A deep dive in not one but two writing projects followed by the above-mentioned trip to Shenzhen meant it was a month full of intense input and output—lots and lots of both.

Read More



View Source II: ThingsCon goes Shenzhen (Part II)

April 28, 2017

Outside HuaqiangBei market, the street looks like a regular retail zone. But inside, it’s unlike any market you’ve ever seen.

TL;DR: Read all notes from our recent, second Shenzhen trip—I’d recommend starting at the beginning.

Last fall, we gathered a small group for an expedition to Shenzhen, China: The Silicon Valley of hardware, where most connected products are produced. We named the trip View Source: Shenzhen (click to read all posts related to that earlier trip; link to the current one below). It was our way to understand better how this incredible hardware ecosystem works, and how indie IoT makers and entrepreneurs can interface with it.

One of many interviews with designers and manufacturers in Shenzhen

In April 2017 we went back to Shenzhen, with a larger delegation: Code name View Source II. There we also held the first ThingsCon Shenzhen event.

Kicking off ThingsCon Shenzhen with the ThingsCon mantra

We’ll have a “proper” write-up later. For now, I’m happy to share my quick & dirty personal travel notes over on my personal blog. Read all View Source II posts—I’d recommend starting at the beginning.



AI: Process v Output

April 19, 2017

TL;DR: Machine learning and artificial intelligence (AI) are beginning to govern ever-greater parts of our lives. If we want to trust their analyses and recommendations, it’s crucial that we understand how they reach their conclusions, how they work, which biases are at play. Alas, that’s pretty tricky. This article explores why.

As machine learning and AI gain importance and manifest in many ways large and small wherever we look, we face some hard questions: Do we understand how algorithms make decisions? Do we trust them? How do we want to deploy them? Do we trust the output, or focus on process?

Please note that this post explores some of these questions, connecting dots from a wide range of recent articles. Some are quoted heavily (like Will Knight’s, Jeff Bezos’s, Dan Hon’s) and linked multiple times over for easier source verification rather than going with endnotes. The post is quite exploratory in that I’m essentially thinking out loud, and asking more questions than I have answers to: tread gently.


In his very good and very interesting 2017 shareholder letter, Jeff Bezos makes a point about not over-valuing process: “The process is not the thing. It’s always worth asking, do we own the process or does the process own us?” This, of course, he writes in the context of management: His point is about optimizing for innovation. About not blindly trusting process over human judgement. About not mistaking existing processes for unbreakable rules that are worth following at any price and to be followed unquestioned.

Bezos also briefly touches on machine learning and AI. He notes that Amazon is both an avid user of machine learning as well as building extensive infrastructure for machine learning—and Amazon being Amazon, making it available to third parties as a cloud-based service. The core point is this (emphasis mine): “Over the past decades computers have broadly automated tasks that programmers could describe with clear rules and algorithms. Modern machine learning techniques now allow us to do the same for tasks where describing the precise rules is much harder.”

Algorithms as a black box: Hard to tell what’s going on inside (Image: ThinkGeek)

That’s right: With machine learning, we can learn to get desirable results but without necessarily knowing how to describe the rules that get us there. It’s pure output. No—or hardly any—process in the sense that we can interrogate or clearly understand it. Maybe not even instruct it, exactly.

Let’s keep this at the back of our minds now, we’ll come back to it later. Exhibit A.


In s4e12 of his excellent newsletter Things That Have Caught My Attention, Dan Hon writes, reflecting on Jeff Bezos’ shareholder letter (I replaced Dan’s endnotes with direct links):

“Machine learning techniques – most recently and commonly, neural networks[1] – are getting pretty unreasonably good at achieving outcomes opaquely. In that: we really wouldn’t know where to start in terms of prescribing and describing the precise rules that would allow you to distinguish a cat from a dog. But it turns out that neural networks are unreasonably effective (…) at doing these kinds of things. (…) We’re at the stage where we can throw a bunch of images to a network and also throw a bunch of images of cars at a network and then magic happens and we suddenly get a thing that can recognize cars.”

Dan goes on to speculate:

“If my intuition’s right, this means that the promise of machine learning is something like this: for any process you can think of where there are a bunch of rules and humans make decisions, substitute a machine learning API. (…) machine learning doesn’t necessarily threaten jobs like “write a contract between two parties that accomplishes x, y and z” but instead threatens jobs where management people make decisions.”

In conclusion:

“Neural networks work the other way around: we tell them the outcome and then they say, “forget about the process!”. There doesn’t need to be one. The process is inside the network, encoded in the weights of connections between neurons. It’s a unit that can be cloned, repeated and so on that just does the job of “should this insurance claim be approved”. If we don’t have to worry about process anymore, then that lets us concentrate on the outcome. Does this mean that the promise of machine learning is that, with sufficient data, all we have to do is tell it what outcome we want?”

The AI, our benevolent dictator

Now if we answered Dan’s question with YES, then this is where things get tricky, isn’t it? It opens the door to a potentially pretty slippery slope.

In political science, a classic question is what the best form of government looks like. While a discussion about what “best” means—freedom? wealth? health? agency? for all or for most? what are the tradeoffs?—is fully legitimate and should be revisited every so often, it boils down to this long-standing conflict:

Can a benevolent dictator, unfettered by external restraints, provide a better life for their subjects?


Does the protection of rights, freedom and agency offered by democracy outweigh the often slow and messy decision-making processes it requires?

Spoiler alert: Generally speaking, democracy won this debate a long time ago.

(Of course there are regions where societies have held on to the benevolent dictatorship model; and the recent rise of the populist right demonstrates that populations around the globe can be attracted to this line of argument.)

The reason democracy—a form of government defined by process!—has surpassed dictatorships both benevolent and malicious is that overall it seems a human endeavor to have agency and freely express it, rather than be governed by an unfettered, unrestricted ruler of any sort.

Every country that chooses democracy over a dictator sacrifices efficiency for process: A process that can be interrogated, understood, adapted. Because, simply stated, a process understood is a process preferred. Being able to understand something gives us power to shape it, to make it work for us: This is true both on the individual and the societal level.

Messy transparency and agency trumps blackbox efficiency.

Let’s keep that in mind, too. Exhibit B.

Who makes the AI?

Andrew Ng, who was heavily involved in Baidu’s (and before that, Google’s) AI efforts, emphasizes the potential impact of AI to transform society: “Just as electricity transformed many industries roughly 100 years ago, AI will also now change nearly every major industry – healthcare, transportation, entertainment, manufacturing – enriching the lives of countless people.”

He continues:

“I want all of us to have self-driving cars; conversational computers that we can talk to naturally; and healthcare robots that understand what ails us. The industrial revolution freed humanity from much repetitive physical drudgery; I now want AI to free humanity from repetitive mental drudgery, such as driving in traffic. This work cannot be done by any single company – it will be done by the global AI community of researchers and engineers.”

While I share Ng’s assessment of AI’s potential impact, I’ve got to be honest: His raw enthusiasm for AI sounds a little scary to me. Free humanity from mental drudgery? Not to wax overly nostalgic, but mental drudgery—even boredom!—has proven really quite important for humankind’s evolution and played a major role in its achievements. Plus, the idea that engineers are the driving force seems risky at least: It’s a pure form of stereotypical Silicon Valley think, almost a cliché. I’m willing to give him the benefit of the doubt and assume that by “researchers” he also meant to include anthropologists, philosophers, political scientists, and all the other valuable perspectives of social sciences, humanities, and other related fields.

Don’t leave something as important as AI to a bunch of tech bros (Image: Giphy)

Something as transformative as this should not, in the 21st century, be driven by a tiny group of people with very homogenous backgrounds. Diversity is key, in professional backgrounds and ways of thinking as much as in gender, ethnic, regional and cultural backgrounds. Otherwise, algorithms are bound to encode and help enforce unhealthy policies.

Engineering-driven, tech-deterministic, non-diverse expansionist thinking delivers sub-optimum results. File under exhibit C.

Automated decision-making

Bezos writes about the importance of making decisions fast, which often requires making them with incomplete information: “most decisions should probably be made with somewhere around 70% of the information you wish you had. If you wait for 90%, in most cases, you’re probably being slow. Plus, either way, you need to be good at quickly recognizing and correcting bad decisions. If you’re good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure.”

This, again, he writes in the context of management—presumably by and through humans. How will algorithmic decision-making fit into this picture? Will we want our algorithms to start deciding—or issuing recommendations—based on 100 percent of information? 90? 70? Maybe there’s an algorithm that figures out through machine learning how much information is just enough to be good enough?

Who is responsible for algorithmically-made decisions? Who bears the responsibility for enforcing them?

If the algorithmic load-optimizing (read: overbooking) system tells airline staff to remove a passenger from a plane and it ends up in a dehumanizing debacle, whose fault is that?

Teacher of Algorithm by Simone Rebaudengo and Daniel Prost

More Dan Hon! Dan takes this to its logical conclusion (s4e11): “We’ve outsourced deciding things, and computers – through their ability to diligently enact policy, rules and procedures (surprise! algorithms!) give us a get out of jail free card that we’re all too happy to employ.” It is, by extension, a jacked up version of “it’s policy, it’s always been our policy, nothing I can do about it.” Which is, of course, the oldest and laziest cop-out there ever was.

He continues: “Algorithms make decisions and we implement them in software. The easy way out is to design them in such a way as to remove the human from the loop. A perfect system. But, there is no such thing. The universe is complicated, and Things Happen. While software can deal with that (…) we can take a step back and say: that is not the outcome we want. It is not the outcome that conscious beings that experience suffering deserve. We can do better.”

I wholeheartedly agree.

To get back to the airline example: In this case I’d argue the algorithm was not at fault. What was at fault is that corporate policy said this procedure has priority, and this was backed up by an organizational culture that made it seem acceptable (or even required) for staff to have police drag a paying passenger off a plane with a bloody lip.

Algorithms blindly followed, backed up by corporate policies and an unhealthy organizational culture: Exhibit D.


In the realm of computer vision, there have been a lot of advances through (and for) machine learning lately. Generative adversarial networks (GANs), in which one network tries to fool another, seem particularly promising. I won’t pretend to understand the math behind GANs, but Quora has us covered:

“Imagine an aspiring painter who wants to do art forgery (G), and someone who wants to earn his living by judging paintings (D). You start by showing D some examples of work by Picasso. Then G produces paintings in an attempt to fool D every time, making him believe they are Picasso originals. Sometimes it succeeds; however as D starts learning more about Picasso style (looking at more examples), G has a harder time fooling D, so he has to do better. As this process continues, not only D gets really good in telling apart what is Picasso and what is not, but also G gets really good at forging Picasso paintings. This is the idea behind GANs.”

So we’ve got two algorithmic networks sparring with one another. Both of them learn a lot, fast.
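
To make the forger-and-judge analogy concrete, here is a minimal toy sketch of the adversarial training loop, in plain NumPy rather than a real framework: a two-parameter “generator” learns to imitate samples from a Gaussian, while a logistic-regression “discriminator” learns to tell real from fake. All numbers here are illustrative, not from any particular GAN paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_real(n):
    # "Picasso originals": samples from a fixed Gaussian (mean 4, std 1.25)
    return rng.normal(4.0, 1.25, n)

# Generator G: forges samples from noise, G(z) = a*z + b
a, b = 1.0, 0.0
# Discriminator D: judges samples, D(x) = sigmoid(w*x + c)
w, c = 0.1, 0.0

lr, batch = 0.05, 64
for _ in range(2000):
    # Train D: push D(real) toward 1 and D(fake) toward 0
    x_real = sample_real(batch)
    x_fake = a * rng.normal(0, 1, batch) + b
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w -= lr * (np.mean((d_real - 1) * x_real) + np.mean(d_fake * x_fake))
    c -= lr * (np.mean(d_real - 1) + np.mean(d_fake))

    # Train G: push D(fake) toward 1, i.e. fool the judge
    z = rng.normal(0, 1, batch)
    d_fake = sigmoid(w * (a * z + b) + c)
    grad_x = -(1 - d_fake) * w        # d(-log D(G(z))) / dx
    a -= lr * np.mean(grad_x * z)
    b -= lr * np.mean(grad_x)

samples = a * rng.normal(0, 1, 1000) + b
print(round(float(np.mean(samples)), 2))  # the forged mean drifts toward the real mean of 4
```

Real GANs replace both linear models with deep networks (in TensorFlow, PyTorch, and the like), but the alternating forge-then-judge loop is the same.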

Impressive, if maybe not lifesaving, results include so-called style transfer. You’ve probably seen it online: you upload a photo and it’s rendered in the style of a famous painter:

Collection style transfer refers to transferring images into artistic styles. Here: Monet, Van Gogh, Ukiyo-e, and Cezanne. (Image source: Jun-Yan Zhu)

Maybe more intuitively impressive, this type of machine learning can also be applied to changing parts of images, or even videos:

Sometimes, failure modes are not just interesting but also look hilarious. (Image source: Jun-Yan Zhu)

This is the kind of algorithmic voodoo that powers things like Snapchat “world lenses” and Facebook’s recent VR announcements (“Act 2”).

Wait, how did we get here? Oh yes, output v process!

Machine learning requires new skills (for creators and users alike)

What about skill sets required to work with machine learning, and to make machines learn in interesting, promising ways?

Google has been remaking itself as a machine learning first company. As Christine Robson, who works on Google’s internal machine learning efforts puts it: “It feels like a living, breathing thing. It’s a different kind of engineering.”

Technology Review features a stunning article absolutely worth reading in full: In The Dark Secret at the Heart of AI, author Will Knight interviews MIT professor Tommi Jaakkola who says:

“Deep learning, the most common of these approaches, represents a fundamentally different way to program computers. ‘It is a problem that is already relevant, and it’s going to be much more relevant in the future. (…) Whether it’s an investment decision, a medical decision, or maybe a military decision, you don’t want to just rely on a ‘black box’ method.’”

And machine learning doesn’t just require different engineering. It requires a different kind of design, too. From Machine Learning for Designers (ebook, free O’Reilly account required): “These technologies will give rise to new design challenges and require new ways of thinking about the design of user interfaces and interactions.”

Machine learning means that algorithms learn from—and increasingly will adapt to—their own performance, user behaviors, and external factors. Processes (however oblique) will change, as will outputs. Quite likely, the interface and experience will also adapt over time. There is no end state but constant evolution.

Technologist & researcher Greg Borenstein argues that “while AI systems have made rapid progress, they are nowhere near being able to autonomously solve any substantive human problem. What they have become is powerful tools that could lead to radically better technology if, and only if, we successfully harness them for human use.”

Borenstein concludes: “What’s needed for AI’s wide adoption is an understanding of how to build interfaces that put the power of these systems in the hands of their human users.”

Future-oriented designers seem to be at least open to this idea. As Fabien Girardin of the Near Future Laboratory argues: “That type of design of system behavior represents a future in the evolution of human-centered design.”

Computers beating the best human chess and Go players have given us centaur chess, in which humans and computers play side by side in a team: while computers beat humans at chess, these hybrid teams of humans and computers playing in tandem beat computers hands-down.

In centaur chess, software provides analysis and recommendations, and a human expert makes the final call. (I’d be interested in seeing the reverse being tested, too: What if human experts gave recommendations for the algorithms to consider?)
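
That division of labor can be sketched in a few lines (a hypothetical illustration; the moves, scores, and names are all made up): the engine ranks candidate moves, and a human callback makes the final call from the shortlist.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Suggestion:
    move: str     # candidate move (illustrative notation)
    score: float  # engine evaluation; higher is better

def centaur_decide(engine_suggestions: List[Suggestion],
                   human_pick: Callable[[List[Suggestion]], str],
                   top_n: int = 3) -> str:
    """Engine proposes, human disposes: software narrows the options,
    the human expert makes the final call."""
    ranked = sorted(engine_suggestions, key=lambda s: s.score, reverse=True)
    return human_pick(ranked[:top_n])

# Hypothetical engine output for one position:
suggestions = [
    Suggestion("Nf3", 0.40), Suggestion("e4", 0.60),
    Suggestion("d4", 0.55), Suggestion("a3", -0.20),
]
# The human may override the engine's top pick for strategic reasons:
choice = centaur_decide(suggestions, human_pick=lambda opts: opts[1].move)
print(choice)  # prints "d4", the second-ranked move
```

The point is the shape of the loop, not the chess: the machine filters, the human decides.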

How does this work? Why is it doing this?

Now, all of this isn’t particularly well understood today. Or more concretely, the algorithms hatched that way aren’t understood, and hence their decisions and recommendations can’t be interrogated easily.

Will Knight shares the story of a self-driving experimental vehicle that was “unlike anything demonstrated by Google, Tesla, or General Motors, and it showed the rising power of artificial intelligence. The car didn’t follow a single instruction provided by an engineer or programmer. Instead, it relied entirely on an algorithm that had taught itself to drive by watching a human do it.”

What makes this really interesting is that it’s not entirely clear how the algorithms learned:

“The system is so complicated that even the engineers who designed it may struggle to isolate the reason for any single action. And you can’t ask it: there is no obvious way to design such a system so that it could always explain why it did what it did (…) It isn’t completely clear how the car makes its decisions.”

Knight stresses just how novel this is: “We’ve never before built machines that operate in ways their creators don’t understand.”

We know that it’s possible to attack machine learning with so-called adversarial examples: inputs intentionally designed to cause the model to make a mistake, or to train the algorithm incorrectly. Even without a malicious attack, algorithms also simply don’t always get the full—or right—picture: “Google researchers noted that when its [Deep Dream] algorithm generated images of a dumbbell, it also generated a human arm holding it. The machine had concluded that an arm was part of the thing.”

This—and this type of failure mode—seems relevant. We need to understand how algorithms work in order to adapt, improve, and eventually trust them.
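
One classic recipe for crafting adversarial inputs is the fast gradient sign method (FGSM): nudge each input feature a step in the direction that increases the model’s loss. Here is a toy sketch against a hand-rolled logistic-regression classifier, not a real image model; the data and the (exaggerated) epsilon are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train a tiny logistic-regression classifier on two toy 2-D clusters.
n = 200
X = np.vstack([rng.normal(-2, 1, (n, 2)), rng.normal(2, 1, (n, 2))])
y = np.concatenate([np.zeros(n), np.ones(n)])

w, b = np.zeros(2), 0.0
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * float(np.mean(p - y))

x = np.array([2.0, 2.0])              # a clearly class-1 input
p = sigmoid(x @ w + b)                # classified correctly (p > 0.5)

# FGSM: x_adv = x + eps * sign(d loss / d x). For logistic regression
# with label y=1, the gradient of the loss w.r.t. the input is (p - 1) * w.
eps = 3.0                             # exaggerated for this toy setting
x_adv = x + eps * np.sign((p - 1.0) * w)
print(sigmoid(x_adv @ w + b) < 0.5)   # the perturbed input now fools the model
```

Against deep image classifiers the same trick works with perturbations small enough to be invisible to humans, which is what makes it unsettling.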

Consider, for example, two areas where algorithmic decision-making could directly decide over life or death: military and medicine. Speaking of military use cases, David Gunning of DARPA’s Explainable Artificial Intelligence program explains: “It’s often the nature of these machine-learning systems that they produce a lot of false alarms, so an intel analyst really needs extra help to understand why a recommendation was made.” Life or death might literally depend on it. What’s more, if a human operator doesn’t fully trust the AI output, then that output is rendered useless.
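
One common mitigation for the false-alarm problem (a hypothetical sketch, not something from the article) is triage by confidence: act automatically only on high-confidence recommendations and route everything else to a human analyst.

```python
from typing import List, Tuple

def triage(predictions: List[Tuple[str, float]],
           threshold: float = 0.9) -> Tuple[List[str], List[str]]:
    """Split model outputs into auto-accepted alerts and cases
    deferred to a human analyst, based on confidence score."""
    auto, deferred = [], []
    for label, confidence in predictions:
        (auto if confidence >= threshold else deferred).append(label)
    return auto, deferred

# Hypothetical recommendations with confidence scores:
preds = [("alert-A", 0.97), ("alert-B", 0.62),
         ("alert-C", 0.91), ("alert-D", 0.40)]
auto, deferred = triage(preds)
print(auto)      # ['alert-A', 'alert-C']
print(deferred)  # ['alert-B', 'alert-D']
```

This doesn’t explain *why* a recommendation was made, of course; it only decides when a human has to look.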


Should we have a legal right to interrogate AI decision making? Again, Knight in Technology Review: “Starting in the summer of 2018, the European Union may require that companies be able to give users an explanation for decisions that automated systems reach. This might be impossible, even for systems that seem relatively simple on the surface, such as the apps and websites that use deep learning to serve ads or recommend songs. The computers that run those services have programmed themselves, and they have done it in ways we cannot understand. Even the engineers who build these apps cannot fully explain their behavior.”

It seems likely that this could currently not even be enforced, that the creators of these algorithmic decision-making systems might not even be able to find out what exactly is going on.

There have been numerous attempts at exploring this, usually through visualizations. This works, to a degree, for machine learning and other areas. However, machine learning is often used to crunch multi-dimensional data sets, and we simply have no great way (yet) of visualizing these in a way that makes them easy to analyze.
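
The standard workaround is to first project the high-dimensional data down to two dimensions, for instance with principal component analysis (PCA), and scatter-plot the result. A minimal sketch via NumPy’s SVD, where the 50-dimensional “learned features” are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)

def pca_2d(X):
    """Project the rows of X onto their top-2 directions of variance."""
    Xc = X - X.mean(axis=0)                        # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                           # top-2 principal components

# Synthetic stand-in for learned features: 300 points in 50 dimensions
# that secretly live near a 2-D plane, plus a little noise.
latent = rng.normal(size=(300, 2))
X = latent @ rng.normal(size=(2, 50)) + 0.05 * rng.normal(size=(300, 50))

proj = pca_2d(X)
print(proj.shape)  # (300, 2): coordinates a 2-D scatter plot can show
```

Nonlinear methods like t-SNE serve the same purpose for messier data, but all of them throw information away; that’s exactly why such plots only go so far.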

This is worrisome to say the least.

But let me play devil’s advocate for a moment: What if the outcomes are really that good, so much better than human-powered analysis or decision-making? Might not using them be simply irresponsible? Knight gives the example of a program at Mount Sinai Hospital in New York called Deep Patient that was “just way better” at predicting certain diseases from patient records.

If this prediction algorithm has a solid track record of successful analysis, but neither developers nor doctors understand how it reaches its conclusions, is it responsible to prescribe medication based on its recommendation? Would it be responsible not to?

Philosopher Daniel Dennett, who studies the mind and consciousness, takes it a step further. An explanation by an algorithm might not be good enough. Humans aren’t great at explaining themselves, so if an AI “can’t do better than us at explaining what it’s doing, then don’t trust it.”

It follows that an AI would need to provide a much better explanation than a human in order for it to be trustworthy. Exhibit E.

Now where does that leave us?

Let’s assume that the impact of machine learning, algorithmic decision-making and AI will keep increasing. A lot. Then we need to understand how algorithms work in order to adapt, improve, and eventually trust them.

Machine learning allows us to get desirable results, but without necessarily knowing how (exhibit A).

It’s essential for a society to be able to understand and shape its governance, and to have agency in doing so. So in AI just like in governance: Transparent messiness is more desirable than oblique efficiency. Black boxes simply won’t do. We cannot have black boxes govern our lives (exhibit B).

Something as transformative as this should not, in the 21st century, be driven by a tiny group of people with very homogeneous backgrounds. Diversity is key, in professional backgrounds and ways of thinking as much as in gender, ethnic, regional and cultural backgrounds. Engineering-driven, tech-deterministic, non-diverse, expansionist thinking delivers sub-optimal results (exhibit C).

Left unchecked, algorithms are bound to encode and help enforce unhealthy policies. Blindly followed, backed up by corporate policies and an unhealthy organizational culture, this is bound to deliver horrible results (exhibit D).

Hence we need to be able to interrogate algorithmic decision-making. And if in doubt, an AI should provide a much better explanation than a human in order for it to be trustworthy (exhibit E).

Machine learning and AI hold great potential to improve our lives. Let’s embrace it, but deliberately and cautiously. And let’s not hand over the keys to software that’s too complex for us to interrogate, understand, and hence shape to serve us. We must apply the same kinds of checks and balances to tech-based governance as to human or legal forms of governance—accountability & democratic oversight and all.

31 Mar


Monthnotes for March 2017

March 31, 2017 | By |

As springtime breathes fresh life into Berlin, March was a productive month of heads-down writing time. Also, in March the company turned 3!
