Swaroop Vattam
 I am a technical staff member at MIT's Lincoln Laboratory, where I am part of the Human Language Technology group. Prior to joining MIT, I was a research scientist in the College of Computing at Georgia Tech and a postdoctoral fellow at NRL (fellowship awarded by the National Academies of Sciences). I received my PhD in Computer Science from Georgia Tech (under the supervision of Ashok Goel) for which I won the distinguished dissertation award in 2012. My main research interests are machine reasoning, machine learning, and natural language processing. To view all of my publications on Google Scholar, please click here. 
musings ...

 Representing data as matrices is ubiquitous in Machine Learning (ML). But the challenge for scalable ML systems is that these structures can become enormous, with millions or hundreds of millions of rows. One way to mitigate this problem is to shrink the matrices by leaving out some of the rows. But for computations on the smaller matrix to yield approximately the right results, the remaining rows have to be, in some sense, representative of those that were omitted. Cohen and Peng (2014) have developed an algorithm that efficiently samples the rows of a matrix while preserving the p-norms of its product with vectors, finding the smallest possible approximation of the original matrix that guarantees reliable computations. They demonstrate that their algorithm is optimal for condensing matrices under any norm. This is an important contribution towards making ML more scalable.
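
 To make the idea of "representative rows" concrete, here is a minimal sketch for the p = 2 case only, using plain leverage-score sampling. This is not Cohen and Peng's algorithm (their method uses Lewis weights to handle general p); the function names, matrix sizes, and sample count below are assumptions chosen purely for illustration.

```python
import numpy as np


def leverage_scores(A):
    """Leverage score of each row of A (squared row norms of Q from a thin QR)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q * Q, axis=1)


def sample_rows(A, num_samples, rng):
    """Sample rows with probability proportional to leverage, then rescale.

    The rescaling makes ||S x||^2 an unbiased estimate of ||A x||^2.
    """
    scores = leverage_scores(A)
    probs = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=num_samples, replace=True, p=probs)
    scale = 1.0 / np.sqrt(num_samples * probs[idx])
    return A[idx] * scale[:, None]


# Illustrative sizes (assumptions): 20,000 rows condensed to 2,000.
rng = np.random.default_rng(0)
A = rng.standard_normal((20000, 10))
S = sample_rows(A, 2000, rng)

x = rng.standard_normal(10)
ratio = np.linalg.norm(S @ x) / np.linalg.norm(A @ x)
print(S.shape, ratio)  # ratio should be close to 1
```

 The smaller matrix S stands in for A in any computation that only depends on the 2-norm of products with vectors. Rows with high leverage are the ones least "represented" by the others, so they are sampled more often.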


 Last night I watched the movie "The Man Who Knew Infinity," a biopic of the Indian math genius Ramanujan. I enjoyed it and felt that it was a much better movie than "The Imitation Game," about the life of another great mathematician, Alan Turing. Some dramatic license was taken in making the movie, but that's to be expected. Interestingly, I found out that Ken Ono was involved in the making of the movie. He is a well-known number theorist at Emory University (in Atlanta), and I happened to attend one of his talks in 2011. Ono and his student recently put out a paper on arXiv called "The 1729 K3 Surface," where they revisit the famous "taxicab" number. After studying Ramanujan's writings firsthand, they found that he was thinking about K3 surfaces several decades before the concept was even coined. This adds yet another chapter to the list of spectacular recent discoveries involving Ramanujan's notebook. K3 surfaces are an exceedingly difficult mountain to climb and an important next frontier in mathematics. That Ramanujan gave remarkable examples illustrating some of their features in 1919 simply blows my mind.
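
 For readers who have not met the "taxicab" number: 1729 is the smallest positive integer that can be written as a sum of two positive cubes in two different ways. A small brute-force check of that claim follows (the function names are my own, chosen for illustration):

```python
def cube_sum_pairs(n):
    """All pairs (a, b) with 1 <= a <= b and a**3 + b**3 == n."""
    limit = round(n ** (1 / 3)) + 1
    return [
        (a, b)
        for a in range(1, limit + 1)
        for b in range(a, limit + 1)
        if a**3 + b**3 == n
    ]


def smallest_taxicab(max_n):
    """Smallest n <= max_n that is a sum of two positive cubes in two ways."""
    for n in range(1, max_n + 1):
        if len(cube_sum_pairs(n)) >= 2:
            return n
    return None


print(cube_sum_pairs(1729))    # [(1, 12), (9, 10)]
print(smallest_taxicab(2000))  # 1729
```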


 A lot of datasets these days come in the form of large and complex networks. They describe intricate systems, with entities modeled as nodes and their relationships as edges. These networks contain a wealth of information, but that information is often hidden within network patterns that are difficult to uncover. While deciphering these patterns is crucial, exact computational analyses of large networks are often intractable. So many questions we ask about the world cannot be answered exactly with any realistic amount of compute power and time. Our only hope is to answer these questions approximately (i.e., heuristically) and prove how far the approximate answer is from the exact one in the worst case. So far, the majority of scalable heuristics for network analysis consider only simple descriptors of a network, such as node count and degree, capturing only its lower-order connectivity patterns. But they suffer from a severe limitation: two networks can appear similar when viewed through the lens of simple descriptors, yet reveal a very different connectivity structure when we change the lens to include higher-order descriptors called graphlets (small subgraphs). While this limitation has been known for some time, little could be done about it because this was a tough nut to crack. A scalable technique for uncovering the higher-order organization of large networks, at the level of graphlets, was not on the horizon. But to my surprise, I came across a recent article by Benson et al. titled "Higher-order organization of complex networks" in Science (2016), where they propose a new framework for clustering networks on the basis of higher-order connectivity patterns, with guarantees on the optimality of the obtained clusters, that scales to billions of edges. This is such an amazing result. Kudos to Benson et al.
 Benson, A. R., Gleich, D. F., & Leskovec, J. (2016). Higher-order organization of complex networks. Science, 353(6295), 163-166.
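
 The limitation of simple descriptors is easy to see on a toy example. The sketch below is only an illustration of the problem, not the clustering framework of Benson et al.; the two six-node graphs and the function names are assumptions made up for this example. Both graphs have the same node count, edge count, and degree sequence, yet they differ in the count of the simplest higher-order pattern, the triangle.

```python
from itertools import combinations


def adjacency(n, edges):
    """Adjacency sets for an undirected graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def degree_sequence(n, edges):
    """Sorted list of node degrees."""
    adj = adjacency(n, edges)
    return sorted(len(adj[v]) for v in range(n))


def count_triangles(n, edges):
    """Count node triples that are pairwise connected (brute force)."""
    adj = adjacency(n, edges)
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


# Graph 1: a single 6-cycle. Graph 2: two disjoint triangles.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
two_triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]

print(degree_sequence(6, cycle), degree_sequence(6, two_triangles))
print(count_triangles(6, cycle), count_triangles(6, two_triangles))
```

 The degree sequences are identical (every node has degree 2), but the triangle counts are 0 and 2. The brute-force triangle count here is cubic in the number of nodes, which is exactly why doing this kind of analysis at the scale of billions of edges is hard.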


 Extreme theorem-proving!!! Researchers have solved a single math problem (the Boolean Pythagorean triples problem) by brute force (SAT solvers on supercomputers, yay!), producing the largest proof ever: over 200 terabytes uncompressed. How did they check it? If a computer-generated proof is too big for humans to check, what can be said of its epistemic status? That aside, this is a big deal. It shows how far we have come since the SAT problem's NP-completeness was established in the early 1970s. We now seem to readily embrace SAT solvers instead of avoiding them like the plague for tractability reasons. The solver community is optimistic that it can solve most SAT problems arising in practical applications. Confused? An explanation for this gap between worst-case hardness and practical success is provided by Dick Lipton in one of his blog entries.
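
 The problem itself is simple to state: can the integers 1..n be split into two parts (two "colors") so that no Pythagorean triple a^2 + b^2 = c^2 lies entirely within one part? The sketch below answers this for a tiny n using plain backtracking. It is not a SAT solver and not the researchers' method, and n = 30 is an arbitrary choice for illustration; the actual result concerns n in the thousands, which is far beyond this kind of naive search.

```python
def pythagorean_triples(n):
    """All (a, b, c) with a < b < c <= n and a^2 + b^2 == c^2."""
    squares = {c * c: c for c in range(1, n + 1)}
    triples = []
    for a in range(1, n + 1):
        for b in range(a + 1, n + 1):
            c = squares.get(a * a + b * b)
            if c is not None:
                triples.append((a, b, c))
    return triples


def find_coloring(n):
    """Backtracking search for a 2-coloring of 1..n with no monochromatic triple.

    Returns a list whose entry k-1 is the color (0 or 1) of k, or None.
    """
    legs_by_hypotenuse = {}
    for a, b, c in pythagorean_triples(n):
        legs_by_hypotenuse.setdefault(c, []).append((a, b))
    color = [None] * (n + 1)

    def assign(k):
        if k > n:
            return True
        for col in (0, 1):
            # k may take this color unless both legs of some triple already have it.
            legs = legs_by_hypotenuse.get(k, [])
            if all(not (color[a] == col and color[b] == col) for a, b in legs):
                color[k] = col
                if assign(k + 1):
                    return True
        color[k] = None
        return False

    return color[1:] if assign(1) else None


def is_valid(coloring):
    """True if no Pythagorean triple within 1..len(coloring) is one color."""
    n = len(coloring)
    return all(
        len({coloring[a - 1], coloring[b - 1], coloring[c - 1]}) > 1
        for a, b, c in pythagorean_triples(n)
    )


coloring = find_coloring(30)
print(coloring is not None and is_valid(coloring))
```

 Checking a proposed coloring (is_valid) is cheap; it is proving that no coloring exists for a given n that requires ruling out an astronomically large search space, hence the enormous proof.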


 The lab that I consider my academic home at Georgia Tech was recently in the news for yet another epic feat. They developed and deployed an AI teaching assistant (TA) called Jill that the students could not tell apart from the human TAs until it was announced at the end of the semester. The best part: Jill was TA'ing for an AI course!! It was also featured in the Washington Post.


 Today is the day of the IJCAI Goal Reasoning workshop. Wish I was there! As a program committee member, I was fortunate to review three excellent papers for this workshop. The workshop has grown in stature quite a bit since its first meeting in 2010. Case in point: AI Communications has agreed to host a Special Issue on Goal Reasoning, and selected papers from the workshop will be invited to participate in it.


 I started reading Harry Potter and the Methods of Rationality (HPMOR) after a friend recommended it to me. It is absolutely amazing! From what I gather, HPMOR is a work of fan fiction by Yudkowsky, who is a brilliant mind (depending on who you ask) and a research fellow at MIRI. It imagines an alternate world in which Harry is not only gifted at magic but also a prodigy who is fiercely loyal to logic, science, progress, and enlightenment. How so? His adoptive parents are not silly, mean-spirited villains but the best Muggles Dumbledore could find, including an Oxford scientist. Here is an excerpt in which Harry quickly figures out how stupid the rules of Quidditch are and refuses to play: “That’s just wrong. That violates every possible rule of game design. Look, the rest of this game sounds like it might make sense, sort of, for a sport I mean, but you’re basically saying that catching the Snitch overwhelms almost any ordinary point spread. … It’s like someone took a real game and grafted on this pointless extra position so that you could be the Most Important Player without needing to really get involved or learn the rest of it. Who was the first Seeker, the King’s idiot son who wanted to play Quidditch but couldn’t understand the rules?” ... Ron’s face pulled into a scowl. “If you don’t like Quidditch, you don’t have to make fun of it!” ... “If you can’t criticise, you can’t optimise. I’m suggesting how to improve the game. And it’s very simple. Get rid of the Snitch.”


Last modified 3 August 2016 at 11:30 am by svattam 