19 Jul 2018
· 3 min read
— shared on
Lobsters,
Hacker News,
Reddit and
Twitter
Programs like PowerPoint, Keynote, and Adobe Illustrator are common tools for
designing posters, but these programs have a number of disadvantages, including
lack of separation of content and presentation and lack of
programmatic control over the output. Designing posters using these programs
can require countless hours calculating positions of elements by hand, manually
laying out content, manually propagating style changes, and repeating these
kinds of tasks over and over again during the iterative process of poster
design.
The idea of using a document preparation system like LaTeX to implement a
poster using code sounds fantastic, and indeed, there are a number of LaTeX
templates and packages for making posters, such as a0poster, sciposter,
and beamerposter. However, I didn’t like the look of the existing themes
and templates — they all looked 20 years old — and this is what kept me
from using LaTeX for making posters, even though I had been using the software
for years for authoring documents.
I finally bit the bullet and spent some time designing a clean, stylish, and
minimal poster theme for LaTeX, building on top of the beamerposter package.
The result has been open-sourced as Gemini, and it makes it really easy to
design posters that look like this:

Why LaTeX?
There are a number of programs commonly used for making academic posters. These
include:
- Word processing programs (e.g. Word, Pages, and LibreOffice Writer)
- Presentation programs (e.g. PowerPoint, Keynote, and LibreOffice Impress)
- Vector editing programs (e.g. Adobe Illustrator and Inkscape)
Why use LaTeX over these programs? The biggest benefit is that LaTeX does not
require manual effort to lay out contents and apply a uniform style to the
entire poster. All layout and styling is done using code relying on TeX’s
sophisticated layout algorithms, and there is a clean separation of content and
presentation, similar to the content/style separation in HTML/CSS.
There are other benefits as well. TeX is a sophisticated typesetting system
that produces excellent results for text as well as mathematical formulae;
LaTeX packages provide support for plotting and algorithmically
specified diagrams and vector graphics; and beamer provides support for
column-based layout, including variable-width and nested columns. This means
that all content in the poster, not just the text, can be produced using code:
no more screenshots of mathematical equations; no more positioning shapes with
the mouse to create diagrams; no more screenshots of plots where the styling
doesn’t quite match the style of the poster; and no more manual positioning of
blocks.
A modern LaTeX poster theme
Building posters with LaTeX is by far a better experience than using
PowerPoint, Keynote, or Illustrator. I felt that the one thing missing was an
aesthetically pleasing poster theme. There’s no reason a poster designed using
LaTeX should look any less beautiful than a poster made using graphic design
software like Adobe Illustrator.
This is what led to the creation of Gemini, a LaTeX poster theme with a focus
on being clean, minimal, and looking great out of the box while being
customizable:


The theme is actually a pretty small amount of code; most of the functionality
is provided by LaTeX and beamerposter. But making conscious choices on title
and block layout, font families, font weights, color schemes, and other little
details makes a pretty big difference in how the poster looks and feels.
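For a sense of what a poster source built on the theme looks like, here is a minimal sketch. It follows standard beamerposter conventions; the specific size options, block contents, and author names are illustrative rather than a verbatim Gemini template:

```latex
\documentclass[final]{beamer}
% beamerposter sets the physical paper size; dimensions are in cm
\usepackage[size=custom,width=120,height=72,scale=1.0]{beamerposter}
\usetheme{gemini}
\usecolortheme{gemini}

\title{A Clean, Minimal Poster}
\author{Jane Doe \and John Smith}
\institute{Example University}

\begin{document}
\begin{frame}[t]
  \begin{columns}[t]
    \begin{column}{0.3\textwidth}
      \begin{block}{Introduction}
        Content is written as plain \LaTeX; the theme handles
        layout and styling.
      \end{block}
    \end{column}
    \begin{column}{0.3\textwidth}
      \begin{block}{Method}
        Math like $\nabla f(x)$ is typeset natively.
      \end{block}
    \end{column}
  \end{columns}
\end{frame}
\end{document}
```

Because the theme loads custom fonts, compiling with a modern engine such as LuaLaTeX is likely required.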
03 Apr 2018
· 4 min read
We turned a MacBook into a touchscreen using only $1 of hardware and a little
bit of computer vision. The proof-of-concept, dubbed “Project Sistine” after
our recreation of the famous painting in the Sistine Chapel, was prototyped
by me, Kevin, Guillermo, and Logan in about 16 hours.
Basic Principle
The basic principle behind Sistine is simple. Surfaces viewed from an angle
tend to look shiny, and you can tell if a finger is touching the surface by
checking if it’s touching its own reflection.

Kevin, back in middle school, noticed this phenomenon and built ShinyTouch,
utilizing an external webcam to build a touch input system requiring virtually
no setup. We wanted to see if we could miniaturize the idea and make it work
without an external webcam. Our idea was to retrofit a small mirror in front of
a MacBook’s built-in webcam, so that the webcam would be looking down at the
computer screen at a sharp angle. The camera would be able to see fingers
hovering over or touching the screen, and we’d be able to translate the video
feed into touch events using computer vision.
Hardware
Our hardware setup was simple. All we needed was to position a mirror at the
appropriate angle in front of the webcam. Here is our bill of materials:
- Small mirror
- Rigid paper plate
- Door hinge
- Hot glue
After some iteration, we settled on a design that could be assembled in minutes
using a knife and a hot glue gun.
Here’s the finished product:

Finger Detection
The first step in processing video frames is detecting the finger. Here’s a
typical example of what the webcam sees:

The finger detection algorithm needs to find the touch/hover point for further
processing. Our current approach uses classical computer vision techniques. The
processing pipeline consists of the following steps:
- Filter for skin colors and binary threshold
- Find contours
- Find the two largest contours and ensure that the contours overlap in the
horizontal direction and the smaller one is above the larger one
- Identify the touch/hover point as the midpoint of the line connecting the
top of the bottom contour and the bottom of the top contour
- Distinguish between touch and hover based on the vertical distance between
the two contours

Shown above is the result of applying this process to a frame from the webcam.
The finger and reflection (contours) are outlined in green, the bounding box is
shown in red, and the touch point is shown in magenta.
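To make steps 3–5 of the pipeline concrete, here is a small pure-Python sketch of the geometric logic, assuming the two contours have already been reduced to (x, y, w, h) bounding boxes in image coordinates (y increasing downward). The box format and the touch threshold are illustrative, not Sistine's actual values:

```python
def touch_point(finger_box, reflection_box, touch_thresh=5):
    # Boxes are (x, y, w, h) in image coordinates, y increasing downward.
    # The finger (the smaller, upper contour) sits above its reflection.
    fx, fy, fw, fh = finger_box
    rx, ry, rw, rh = reflection_box

    # Step 3: the two contours must overlap in the horizontal direction.
    if fx + fw < rx or rx + rw < fx:
        return None

    finger_bottom = fy + fh  # bottom of the top contour
    reflection_top = ry      # top of the bottom contour

    # Step 4: touch/hover point is the midpoint of the segment
    # connecting the two contours.
    x = (max(fx, rx) + min(fx + fw, rx + rw)) / 2
    y = (finger_bottom + reflection_top) / 2

    # Step 5: touch vs. hover from the vertical gap between contours.
    gap = reflection_top - finger_bottom
    state = 'touch' if gap <= touch_thresh else 'hover'
    return (x, y, state)
```

For example, a finger box that nearly meets its reflection registers as a touch, while a large vertical gap registers as a hover.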
Mapping and Calibration
The final step in processing the input is mapping the touch/hover point from
webcam coordinates to on-screen coordinates. The two are related by a
homography. We compute the homography matrix through a calibration process
where the user is prompted to touch specific points on the screen. After we
collect data matching webcam coordinates with on-screen coordinates, we can
estimate the homography robustly using RANSAC. This gives us a projection
matrix that maps webcam coordinates to on-screen coordinates.
The video above demonstrates the calibration process, where the user has to
follow a green dot around the screen. The video includes some debug
information, overlaid on live video from the webcam. The touch point in webcam
coordinates is shown in magenta. After the calibration process is complete, the
projection matrix is visualized with red lines, and the software switches to a
mode where the estimated touch point is shown as a blue dot.
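As an illustration of the underlying math, here is a minimal NumPy sketch that estimates a homography from point correspondences with a plain least-squares DLT solve, a simplification of the RANSAC-based estimation described above, on made-up synthetic calibration data:

```python
import numpy as np

def estimate_homography(src, dst):
    # Direct Linear Transform: each correspondence (x, y) -> (u, v)
    # contributes two rows to the homogeneous system A h = 0; the
    # solution is the right singular vector of the smallest singular value.
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so H[2, 2] == 1

def project(H, pt):
    # Apply the homography in homogeneous coordinates.
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Synthetic calibration data: webcam points and where a known
# homography sends them on screen.
H_true = np.array([[1.2,  0.1,  5.0],
                   [0.05, 0.9, -3.0],
                   [1e-4, 2e-4, 1.0]])
src = [(0, 0), (100, 0), (0, 100), (100, 100), (40, 70), (80, 30)]
dst = [project(H_true, p) for p in src]
H = estimate_homography(src, dst)
```

On clean correspondences the estimate matches the true homography; RANSAC adds robustness when some calibration touches are noisy or spurious.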
Applications
In the current prototype, we translate hover and touch into mouse events,
making existing applications instantly touch-enabled.
If we were writing our own touch-enabled apps, we could directly make use of
touch data, including information such as hover height.
Conclusion
Project Sistine is a proof-of-concept that turns a laptop into a touchscreen
using only $1 of hardware, and for a prototype, it works pretty well! With some
simple modifications such as a higher resolution webcam (ours was 480p) and a
curved mirror that allows the webcam to capture the entire screen, Sistine
could become a practical low-cost touchscreen system.
25 Jul 2017
· 6 min read
Synthesizing adversarial examples for neural networks is surprisingly easy: small, carefully-crafted perturbations to inputs can cause neural networks to misclassify inputs in arbitrarily chosen ways. Given that adversarial examples transfer to the physical world and can be made extremely robust, this is a real security concern.
In this post, we give a brief introduction to algorithms for synthesizing adversarial examples, and we walk through the process of implementing attacks in TensorFlow, building up to synthesizing a robust adversarial example following this technique.
This post is an executable Jupyter notebook: you’re encouraged to download it and experiment with the examples yourself!
Setup
We choose to attack an Inception v3 network trained on ImageNet. In this section, we load a pre-trained network from the TF-slim image classification library. This part isn’t particularly interesting, so feel free to skip this section.
import tensorflow as tf
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets as nets
tf.logging.set_verbosity(tf.logging.ERROR)
sess = tf.InteractiveSession()
First, we set up the input image. We use a tf.Variable instead of a tf.placeholder because we will need it to be trainable. We can still feed it when we want to.
image = tf.Variable(tf.zeros((299, 299, 3)))
Next, we load the Inception v3 model.
def inception(image, reuse):
    preprocessed = tf.multiply(tf.subtract(tf.expand_dims(image, 0), 0.5), 2.0)
    arg_scope = nets.inception.inception_v3_arg_scope(weight_decay=0.0)
    with slim.arg_scope(arg_scope):
        logits, _ = nets.inception.inception_v3(
            preprocessed, 1001, is_training=False, reuse=reuse)
        logits = logits[:,1:] # ignore background class
        probs = tf.nn.softmax(logits) # probabilities
    return logits, probs
logits, probs = inception(image, reuse=False)
Next, we load pre-trained weights. This Inception v3 has a top-5 accuracy of 93.9%.
import tempfile
from urllib.request import urlretrieve
import tarfile
import os
data_dir = tempfile.mkdtemp()
inception_tarball, _ = urlretrieve(
    'http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz')
tarfile.open(inception_tarball, 'r:gz').extractall(data_dir)
restore_vars = [
    var for var in tf.global_variables()
    if var.name.startswith('InceptionV3/')
]
saver = tf.train.Saver(restore_vars)
saver.restore(sess, os.path.join(data_dir, 'inception_v3.ckpt'))
Next, we write some code to show an image, classify it, and show the classification result.
import json
import matplotlib.pyplot as plt
imagenet_json, _ = urlretrieve(
    'http://www.anishathalye.com/media/2017/07/25/imagenet.json')
with open(imagenet_json) as f:
    imagenet_labels = json.load(f)
def classify(img, correct_class=None, target_class=None):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 8))
    fig.sca(ax1)
    p = sess.run(probs, feed_dict={image: img})[0]
    ax1.imshow(img)
    fig.sca(ax1)
    topk = list(p.argsort()[-10:][::-1])
    topprobs = p[topk]
    barlist = ax2.bar(range(10), topprobs)
    if target_class in topk:
        barlist[topk.index(target_class)].set_color('r')
    if correct_class in topk:
        barlist[topk.index(correct_class)].set_color('g')
    plt.sca(ax2)
    plt.ylim([0, 1.1])
    plt.xticks(range(10),
               [imagenet_labels[i][:15] for i in topk],
               rotation='vertical')
    fig.subplots_adjust(bottom=0.2)
    plt.show()
Example image
We load our example image and make sure it’s classified correctly.
import PIL
import numpy as np
img_path, _ = urlretrieve('http://www.anishathalye.com/media/2017/07/25/cat.jpg')
img_class = 281
img = PIL.Image.open(img_path)
big_dim = max(img.width, img.height)
wide = img.width > img.height
new_w = 299 if not wide else int(img.width * 299 / img.height)
new_h = 299 if wide else int(img.height * 299 / img.width)
img = img.resize((new_w, new_h)).crop((0, 0, 299, 299))
img = (np.asarray(img) / 255.0).astype(np.float32)
classify(img, correct_class=img_class)

Adversarial examples
Given an image $x$, our neural network outputs a probability distribution over labels, $P(y \mid x)$. When we craft an adversarial input, we want to find an $\hat{x}$ where $\log P(\hat{y} \mid \hat{x})$ is maximized for a target label $\hat{y}$: that way, our input will be misclassified as the target class. We can ensure that $\hat{x}$ doesn’t look too different from the original $x$ by constraining ourselves to some $\ell_\infty$ box with radius $\epsilon$, requiring that $\left\lVert x - \hat{x} \right\rVert_\infty \le \epsilon$.
In this framework, an adversarial example is the solution to a constrained optimization problem that we can solve using backpropagation and projected gradient descent, basically the same techniques that are used to train networks themselves. The algorithm is simple:
We begin by initializing our adversarial example as $\hat{x} \leftarrow x$. Then, we repeat the following until convergence:
- $\hat{x} \leftarrow \hat{x} + \alpha \cdot \nabla \log P(\hat{y} \mid \hat{x})$
- $\hat{x} \leftarrow \mathrm{clip}(\hat{x},\ x - \epsilon,\ x + \epsilon)$
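To see the shape of this procedure in isolation before building it in TensorFlow, here is a toy one-dimensional example of projected gradient descent (purely illustrative, not part of the attack): we maximize $f(x) = -(x - 3)^2$ while constraining $x$ to stay within $\epsilon = 1$ of its starting point.

```python
import numpy as np

# Toy projected gradient descent: maximize f(x) = -(x - 3)^2
# while keeping x inside an epsilon-box around the start point.
x_start = 0.0
epsilon = 1.0
lr = 0.1

x = x_start
for _ in range(100):
    grad = -2 * (x - 3)                                   # ascend the objective
    x = x + lr * grad
    x = np.clip(x, x_start - epsilon, x_start + epsilon)  # project into the box

# The unconstrained optimum is x = 3, but the projection pins the
# iterate to the boundary of the feasible box, x = 1.
```

The same two-step structure (gradient step, then projection) is what the TensorFlow ops below implement, with the network's log probability as the objective.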
Initialization
We start with the easiest part: writing a TensorFlow op for initialization.
x = tf.placeholder(tf.float32, (299, 299, 3))
x_hat = image # our trainable adversarial input
assign_op = tf.assign(x_hat, x)
Gradient descent step
Next, we write the gradient descent step to maximize the log probability of the target class (or equivalently, minimize the cross entropy).
learning_rate = tf.placeholder(tf.float32, ())
y_hat = tf.placeholder(tf.int32, ())
labels = tf.one_hot(y_hat, 1000)
loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=[labels])
optim_step = tf.train.GradientDescentOptimizer(
    learning_rate).minimize(loss, var_list=[x_hat])
Projection step
Finally, we write the projection step to keep our adversarial example visually close to the original image. Additionally, we clip $\hat{x}$ to $[0, 1]$ to keep it a valid image.
epsilon = tf.placeholder(tf.float32, ())
below = x - epsilon
above = x + epsilon
projected = tf.clip_by_value(tf.clip_by_value(x_hat, below, above), 0, 1)
with tf.control_dependencies([projected]):
    project_step = tf.assign(x_hat, projected)
Execution
Finally, we’re ready to synthesize an adversarial example. We arbitrarily choose “guacamole” (imagenet class 924) as our target class.
demo_epsilon = 2.0/255.0 # a really small perturbation
demo_lr = 1e-1
demo_steps = 100
demo_target = 924 # "guacamole"
# initialization step
sess.run(assign_op, feed_dict={x: img})
# projected gradient descent
for i in range(demo_steps):
    # gradient descent step
    _, loss_value = sess.run(
        [optim_step, loss],
        feed_dict={learning_rate: demo_lr, y_hat: demo_target})
    # project step
    sess.run(project_step, feed_dict={x: img, epsilon: demo_epsilon})
    if (i+1) % 10 == 0:
        print('step %d, loss=%g' % (i+1, loss_value))
adv = x_hat.eval() # retrieve the adversarial example
step 10, loss=4.18923
step 20, loss=0.580237
step 30, loss=0.0322334
step 40, loss=0.0209522
step 50, loss=0.0159688
step 60, loss=0.0134457
step 70, loss=0.0117799
step 80, loss=0.0105757
step 90, loss=0.00962179
step 100, loss=0.00886694
This adversarial image is visually indistinguishable from the original, with no visual artifacts. However, it’s classified as “guacamole” with high probability!
classify(adv, correct_class=img_class, target_class=demo_target)

Robust adversarial examples
Now, we go through a more advanced example. We follow our approach for synthesizing robust adversarial examples to find a single perturbation of our cat image that’s simultaneously adversarial under some chosen distribution of transformations. We could choose any distribution of differentiable transformations; in this post, we’ll synthesize a single adversarial input that’s robust to rotation by $\theta \in [-\pi/4, \pi/4]$.
Before we proceed, let’s check if our previous example is still adversarial if we rotate it, say by an angle of $\theta = \pi/8$.
ex_angle = np.pi/8
angle = tf.placeholder(tf.float32, ())
rotated_image = tf.contrib.image.rotate(image, angle)
rotated_example = rotated_image.eval(feed_dict={image: adv, angle: ex_angle})
classify(rotated_example, correct_class=img_class, target_class=demo_target)

Looks like our original adversarial example is not rotation-invariant!
So, how do we make an adversarial example robust to a distribution of transformations? Given some distribution of transformations $T$, we can maximize $\mathbb{E}_{t \sim T} \log P\left(\hat{y} \mid t(\hat{x})\right)$, subject to $\left\lVert x - \hat{x} \right\rVert_\infty \le \epsilon$. We can solve this optimization problem via projected gradient descent, noting that $\nabla \mathbb{E}_{t \sim T} \log P\left(\hat{y} \mid t(\hat{x})\right)$ is $\mathbb{E}_{t \sim T} \nabla \log P\left(\hat{y} \mid t(\hat{x})\right)$ and approximating with samples at each gradient descent step.
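The exchange of gradient and expectation can be sanity-checked numerically on a toy function (with a made-up $f$ and distribution, unrelated to the network): for $f(x, t) = (tx)^2$ with $t \sim \mathrm{Uniform}(0, 1)$, the gradient of the expectation is $\nabla_x \, \mathbb{E}_t[f] = \mathbb{E}_t[2t^2 x] = \frac{2}{3}x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample-based estimate of the gradient of an expectation:
# f(x, t) = (t * x)^2 with t ~ Uniform(0, 1), so grad_x f = 2 * t^2 * x,
# and grad_x E_t[f] = E_t[grad_x f] = (2 / 3) * x.
x = 1.5
t = rng.uniform(0.0, 1.0, size=200_000)
grad_estimate = np.mean(2 * t**2 * x)  # Monte Carlo average of sampled gradients
closed_form = (2 / 3) * x
```

With enough samples the Monte Carlo average converges to the closed-form gradient, which is exactly what the sampled gradient steps below rely on.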
Rather than manually implementing the gradient sampling, we can use a trick to get TensorFlow to do it for us: we can model our sampling-based gradient descent as doing gradient descent over an ensemble of stochastic classifiers that randomly sample from the distribution and transform their input before classifying it.
num_samples = 10
average_loss = 0
for i in range(num_samples):
    rotated = tf.contrib.image.rotate(
        image, tf.random_uniform((), minval=-np.pi/4, maxval=np.pi/4))
    rotated_logits, _ = inception(rotated, reuse=True)
    average_loss += tf.nn.softmax_cross_entropy_with_logits(
        logits=rotated_logits, labels=labels) / num_samples
We can reuse our assign_op and project_step, though we’ll have to write a new optim_step for this new objective.
optim_step = tf.train.GradientDescentOptimizer(
    learning_rate).minimize(average_loss, var_list=[x_hat])
Finally, we’re ready to run PGD to generate our adversarial input. As in the previous example, we’ll choose “guacamole” as our target class.
demo_epsilon = 8.0/255.0 # still a pretty small perturbation
demo_lr = 2e-1
demo_steps = 300
demo_target = 924 # "guacamole"
# initialization step
sess.run(assign_op, feed_dict={x: img})
# projected gradient descent
for i in range(demo_steps):
    # gradient descent step
    _, loss_value = sess.run(
        [optim_step, average_loss],
        feed_dict={learning_rate: demo_lr, y_hat: demo_target})
    # project step
    sess.run(project_step, feed_dict={x: img, epsilon: demo_epsilon})
    if (i+1) % 50 == 0:
        print('step %d, loss=%g' % (i+1, loss_value))
adv_robust = x_hat.eval() # retrieve the adversarial example
step 50, loss=0.0804289
step 100, loss=0.0270499
step 150, loss=0.00771527
step 200, loss=0.00350717
step 250, loss=0.00656128
step 300, loss=0.00226182
This adversarial image is classified as “guacamole” with high confidence, even when it’s rotated!
rotated_example = rotated_image.eval(feed_dict={image: adv_robust, angle: ex_angle})
classify(rotated_example, correct_class=img_class, target_class=demo_target)

Evaluation
Let’s examine the rotation-invariance of the robust adversarial example we produced over the entire range of angles, looking at $P\left(\hat{y} \mid \mathrm{rotate}(\hat{x}, \theta)\right)$ over $\theta \in [-\pi/4, \pi/4]$.
thetas = np.linspace(-np.pi/4, np.pi/4, 301)
p_naive = []
p_robust = []
for theta in thetas:
    rotated = rotated_image.eval(feed_dict={image: adv_robust, angle: theta})
    p_robust.append(probs.eval(feed_dict={image: rotated})[0][demo_target])
    rotated = rotated_image.eval(feed_dict={image: adv, angle: theta})
    p_naive.append(probs.eval(feed_dict={image: rotated})[0][demo_target])
robust_line, = plt.plot(thetas, p_robust, color='b', linewidth=2, label='robust')
naive_line, = plt.plot(thetas, p_naive, color='r', linewidth=2, label='naive')
plt.ylim([0, 1.05])
plt.xlabel('rotation angle')
plt.ylabel('target class probability')
plt.legend(handles=[robust_line, naive_line], loc='lower right')
plt.show()

It’s super effective!