The story about the NSA's
Prism program was broken by the U.K. Guardian yesterday.
refuted those claims today. These are three short notes on the technical implications:
The leaked materials show that the NSA has direct access to internal
user data at various technology companies. The CEOs of all the
implicated companies have denied the existence of backdoors or
unfettered access to the NSA. There is one possible scenario in
which they are both correct.
The NSA undoubtedly has access to peering points where Internet
traffic is transferred between ISPs. All they would need for
unfettered access to user data is the SSL/TLS private key file
used to encrypt traffic. This file is required at all points
where SSL/TLS traffic terminates (i.e., many, many load
balancers and proxies), and is probably less than a kilobyte.
It's small enough to be transferred on paper.
Certificate pinning, elliptical curve cryptography, large
high-entropy keys are all completely useless if the NSA can
subpoena or otherwise acquire the private keys and install
beam splitters at peering points.
There's no need to be a man-in-the-middle when you can be
The acknowledged use of phone-call metadata implies the use
of a communications graph, possibly to find second-degree
nodes that are connected to "suspicious" nodes, or infer
community structure. These topics have been extensively
studied in the last decade (search Google Scholar for
semi-supervised learning and community detection in networks).
The use of graph based methods for detection leads to
graph-based ideas for avoiding detection as well. It's
plausible that knowledge about the detection method will
lead to changes in interaction patterns that will make
the method less effective. What's worse, however, is the possibility
of "metadata attacks",
where elements within a "suspicious" cell create a pattern
of links to a target, causing the target to also be classified
Link prediction is a well-studied
research problem where, given a graph, the goal is to predict links
that are "missing" in some sense. These could be friendships in a
social network (think "people you may know..."), or whatever you
would call links in a network of "suspicious" people. One of the
simplest ways you could do this is to count the number of friends
two people (who are not friends themselves) have in common. The
higher the number of common friends, the more likely it is that
the two people are actually (or should be, or will become) friends.
An excellent but slightly older introduction to the problem is
this excellent paper by
Liben-Nowell et al. The most worrying question is what becomes
of the false-positive links?