# What innovations in CS theory from the last 10 years have had an impact outside of academia?

Well, let’s see.

• Bloom filters are a clever technique that gives practical solutions for a wide variety of problems. The basic idea was first invented in 1970 and saw practical application throughout the decades since then, but theorists picked it up again around 2000 and have devised a long line of variations that have made Bloom filters useful (and in fact used) for many more diverse tasks. See [1] for a survey.
• Streaming algorithms form an area of CS theory that has produced a large number of novel algorithms for making analysis of massive datasets tractable. The name was coined about 15 years ago in a paper of Alon, Matias, and Szegedy; much of the work was done in the past decade. Network routers and other devices processing giant streams of data use these algorithms by necessity. See the Wikipedia article [2] for an overview and many links.
• Lattice-based cryptography and post-quantum cryptography generally — this is an area whose major real-world impact has not yet happened, but it’s unusual in that we can be pretty confident it will happen. Quantum computers are coming within the next few decades. When they do, they will make it easy to factor numbers and to compute discrete logarithms, destroying the security of many of the most widely-used cryptosystems that protect the Internet. The last decade has seen a great deal of work on other families of cryptosystems, with a focus on bringing them all the way to practice so that as quantum computing develops we’ll be ready to use them. See http://pqcrypto.org/ and in particular [3]. As a bonus, some of these cryptosystems have practical application today, even without the threat of quantum computers.
• Distributed hash tables, as Nelson pointed out, are another good example. Rather than duplicate his description, I’ll just add a pair of links: the Wikipedia article [4] for lack of a better survey, and one of the early papers [5].
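To make the Bloom-filter bullet concrete, here is a minimal sketch in C++. Everything in it (the class name, the choice of 2^16 bits, 4 hash functions, and the index-mixing trick for deriving multiple hashes) is my own illustration, not taken from the survey in [1]:

```cpp
#include <bitset>
#include <cstddef>
#include <functional>
#include <string>

class BloomFilter {
    static constexpr std::size_t M = 1 << 16;  // number of bits in the filter
    static constexpr std::size_t K = 4;        // number of hash functions

    std::bitset<M> bits;

    static std::size_t hash_i(const std::string &s, std::size_t i) {
        // Derive K hashes from one by mixing the index into the key.
        // (Real implementations use double hashing or better mixing.)
        return std::hash<std::string>{}(s + '#' + std::to_string(i)) % M;
    }

public:
    void insert(const std::string &s) {
        for (std::size_t i = 0; i < K; ++i) bits.set(hash_i(s, i));
    }

    // May report a false positive, but never a false negative.
    bool maybe_contains(const std::string &s) const {
        for (std::size_t i = 0; i < K; ++i)
            if (!bits.test(hash_i(s, i))) return false;
        return true;
    }
};
```

Inserting an element sets K bits; a lookup answers "possibly present" only if all K bits are set. That asymmetry (false positives possible, false negatives impossible) is what makes the structure so space-efficient.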

These are just some examples that I happen to know about because of the people I know and the things I was interested in when I was in academia; someone who’s really followed the field well could doubtless produce a much longer list.

So, turning to your follow-up questions, does theoretical computer science matter? Yes, some of it definitely does. Some machine learning research matters too.

Should you go into it as a career? If you’re interested in practical impact, probably not. The incentives in academia don’t reward practical impact, and you have to constantly move against the grain if you want to work toward such impact. I’ve observed this not only in theory but also in systems, where grad students do things like write operating systems from scratch in C and assembly, a field that might seem as far from theory as it’s possible to get. So I doubt it’s any different in machine learning.

Instead, if your goal is to have a big impact, you should probably just head for Silicon Valley. Here it’s all about the practical impact, there are plenty of people ambitious to change an industry or the world, and some of them do.

[1] Andrei Broder and Michael Mitzenmacher. “Network Applications of Bloom Filters: A Survey” (2005). http://www.eecs.harvard.edu/~michaelm/postscripts/im2005b.pdf
[2] http://en.wikipedia.org/wiki/Streaming_algorithm
[4] http://en.wikipedia.org/wiki/Distributed_hash_table
[5] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications” (2001). http://pdos.csail.mit.edu/papers/chord:sigcomm01/


# Should a PhD student compete in programming competitions?

> “Brains are muscles. They grow strong with exercise. And even if they’re strong, they grow weak without it.” (advice from Manuel Blum to beginning graduate students) A self-contained algorithmic puzzle with cute solutions (that one can also look up after giving up) is one of the best exercises that I’m aware of.

From the answer by Richard Peng to the question “Should a PhD student compete in programming competitions?”

My take on this: unless your opportunity to participate was severely limited during your undergraduate studies (e.g. if you went to Tsinghua), wait until you have a few major publications before taking part again.

I’m saying this mainly because doing well in programming competitions after the high school level is very practice-dependent. Early graduate studies often revolve around taking courses and reading past literature. Both of these activities directly conflict with the time required to stay in touch with the problem style and to maintain a reasonable coding speed.

That being said, it’s also my impression that as the main activity shifts toward developing new ideas, doing a few problems once in a while is quite beneficial for gaining fresh perspectives. This is perhaps best explained by a piece of advice from Manuel Blum to beginning graduate students: “Brains are muscles. They grow strong with exercise. And even if they’re strong, they grow weak without it.” A self-contained algorithmic puzzle with cute solutions (that one can also look up after giving up) is one of the best exercises that I’m aware of.

# [OJ report] binary tree right side view

https://leetcode.com/problems/binary-tree-right-side-view/

My solution is on GitHub.

The idea is to use DFS to traverse the mirror tree in pre-order, i.e. visit the right child before the left one, while passing the current depth down the recursion. A node is appended to the result only when it is the first node visited at its depth; because of the right-first order, that node is exactly the rightmost node of its level.

```cpp
class Solution {
public:
    // Pre-order DFS of the mirror tree (visit the right subtree
    // before the left one).
    void Helper(TreeNode * root, vector<int> & result, int cur) {
        if (root == NULL) return;
        // The first node visited at depth `cur` is the rightmost
        // node of that level, so record it exactly once.
        if ((int)result.size() == cur) {
            result.push_back(root->val);
        }
        Helper(root->right, result, cur + 1);
        Helper(root->left,  result, cur + 1);
    }

    vector<int> rightSideView(TreeNode * root) {
        vector<int> result;
        if (root == NULL) return result;
        Helper(root, result, 0);
        return result;
    }
};
```

The runtime complexity is $O(n)$, where $n$ is the number of nodes.

The worst-case memory complexity is $O(n)$, due to the recursion depth on a completely skewed tree.

# Video talks about system and architecture designs from companies and academia

Just for my own reading on system designs:

Dropbox

• Challenges in Mirroring Large MySQL Systems to HBase – Data@Scale
• Edgestore – Data@Scale
• Large-Scale Low-Latency Storage for the Social Network – Data@Scale: https://www.youtube.com/watch?v=5RfFhMwRAic
• Storage Systems at a Rapidly Scaling Startup with a Small Team – Data@Scale: https://www.youtube.com/watch?v=bLyv8zKa5DU
• Mobile Networking Challenges – Networking @Scale: https://www.youtube.com/watch?v=V6BBHzX7KCY
• Turning Caches into Distributed Systems with mcrouter – Data@Scale: https://www.youtube.com/watch?v=e9lTgFO-ZXw
• RocksDB: A High Performance Embedded Key-Value Store for Flash Storage – Data@Scale: https://www.youtube.com/watch?v=V_C-T5S-w8g

RAMCloud

# Hacking SPICE — tracing the function pointer to launch transient simulation job

The data structure describing a SPICE analysis is defined in https://github.com/zhuangh/ngspice/blob/master/src/spicelib/analysis/analysis.h

```c
struct SPICEanalysis {
    IFanalysis if_analysis;
    int size;
    int domain;
    int do_ic;
    int (*setParm)(CKTcircuit *ckt, JOB *anal,
                   int which, IFvalue *value);
    int (*an_init)(CKTcircuit *ckt, JOB *job);
    int (*an_func)(CKTcircuit *ckt, int restart);
};
```

When SPICE initializes an analysis of a given type, it fills in a descriptor with the corresponding layout (taking transient simulation as an example):

```c
SPICEanalysis TRANinfo = {
    {
        "TRAN",
        "Transient analysis",
        NUMELEMS(TRANparms),
        TRANparms
    },
    sizeof(TRANan),
    TIMEDOMAIN,
    1,
    TRANsetParm,