Machine Learning with JSAT: 2018

On March 2nd I defended my PhD thesis at UMBC! I can now call myself Dr. Raff and be not quite the Jewish doctor my mom had always hoped for!

It's been a whirlwind of a journey, which started a bit over two years ago. I've been very fortunate to have a lot of support from my work, client, friends, and family that allowed me to do this. It makes me hesitant to talk about advice and overall thoughts on the process, given that mine has been far from normal and quite blessed in terms of resources.

But I do want to write down some of my thoughts on publishing, at least from my limited experience. Before 2016, I had never written a real academic paper - let alone published one in a peer-reviewed venue. I've now got 9 peer-reviewed papers, 2 papers under revision, 3 papers under review, and two new papers being written. For a two-year window I feel that has gone quite well, and overall I've enjoyed being able to share what I've done - but I can't say I've enjoyed the sharing process itself.

Writing a paper is something I had thought about and knew would be a requirement if I ever did a PhD, but it was something I feared. Written communication has always been a weakness of mine, due in part to dyslexia, and I had no concept of what it was like to write a real paper and get it published. There were, and still are, tons of resources online about how to write the paper itself, but none that I've found talk about how it feels.

The writing of papers itself has so far been surprisingly fun. You've got your intuition about the problem in your head and some ugly code that solves it. Now you are modeling your ideas into a more formal notion - something that can be shared with a wider audience. I've found the writing process itself helps me devise more experiments to run. Imagining I was reading this paper and someone else had written it, what would I like to know more about? What would have convinced me? Then go do that and add it in!

But that's all the writing. Then comes the dread.

Submitting your paper for review. You rush to the deadline and get it in, and now you wait. You wait for 2-4 months to hear back from 2-3 of your peers about your paper. When first starting, I assumed reviewers would look for the same things I looked for in papers:

Did I learn something new from this? (Or if you are math-dumb like me, did I understand it at all?)
Are the contributions of the paper helpful to some set of problems / needs, and how big is that set?
Do I gain any new capabilities, or improve existing capabilities?
Do the contributions make my life easier, even if I don't have any new or improved abilities?

At a high level these seemed reasonable, and I was told that reviewers should be looking for these things. But then I got my first set of reviews back, and second, and third... and I've been surprised at how needlessly negative reviewers can be. I should caution that most of my papers have been in the space of ML + Security (specifically malware), so I've got a biased sample pool. But I've found that I almost always get at least one negative review, each with differing complaints. My impression has been that reviewers are often instead looking for (from the reviewer's apparent perspective):

Do I think the solution is obvious?
Did the paper answer a question I had? Its corollary: does the paper avoid contradicting anything I believe?
Does the contribution solve a problem I care about?
Do the contributions result in improvements to standard benchmarks?

I've emphasized the "I"s in these, as I believe this is where a lot of the problem comes from. Even when my papers are accepted, there is usually a reviewer #2 who is dissenting and dismissive, with a cavalier tone and disparagements that can suck the life out of you. The reviewer sees only themselves, and perhaps how smart their review makes them seem - without care that there is a person behind the paper. A person who has spent months, possibly years on this paper - the results, the experiments, the writing itself - and has been waiting for judgment for months.

One of my favorites is the complaint about novelty, as brought up in a recent bit of tweeting (which inspired me to finish this blog post). One of my favorite papers so far, and the one at the most prestigious venue I've gotten a paper into, was the one creating LZJD at KDD. It was inspired by a 2004 paper that introduced the Normalized Compression Distance (NCD), a method of measuring similarity between arbitrary things via compression algorithms. While my paper was accepted, I did have one reviewer who disparaged it as "lacking novelty" - complaining that anyone could have come up with it. Yet no one did come up with it. Despite the many follow-up papers to NCD, for at least a decade, no one had come up with the same idea. This review was simply a list of complaints, with almost platitudinal strengths - as if patting me on the head for trying hard.

While they clearly thought it was obvious that LZJD could be made, they clearly didn't make it. And neither had anyone else, apparently. And maybe the reviewer truly is just better versed in that area, and would have quickly devised my same solution as their first pass. But that doesn't mean it was obvious to everyone, and that doesn't mean it isn't novel.
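For readers who haven't run into NCD, here is a minimal sketch of the idea, so you can judge for yourself how simple the starting point is. This is not LZJD, and it is not code from either paper: zlib is just a convenient stand-in compressor, and the function names are made up for illustration. The intuition is that if two inputs share a lot of content, compressing them together costs little more than compressing the larger one alone.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (illustrative helper)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance, approximated with zlib.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size. Values near 0 suggest the
    inputs are similar; values near 1 suggest they share little.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    text = b"the quick brown fox jumps over the lazy dog " * 20
    near_copy = text.replace(b"lazy", b"idle")
    # Seeded pseudo-random bytes stand in for an unrelated input.
    rng = random.Random(0)
    noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

    print("text vs near copy:", ncd(text, near_copy))
    print("text vs noise:    ", ncd(text, noise))
```

The near copy should score much closer to 0 than the noise does. A real compressor only approximates the ideal, so the scores are rough and depend on which compressor you pick.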

Staying on the note of novelty - I also have found frustration at its amorphous definition in the eyes of the reviewers. There is an apparent belief among many that novelty is synonymous with complexity; that simple solutions can't be novel by definition. This is something that has increasingly caused me anxiety as I write papers - as I personally strive for simpler solutions. Simpler solutions are easier to implement, to maintain, to debug, to share, to replicate, and to understand. I like simple solutions. But current publishing seems to penalize this, and I've found myself on occasion being drawn to the idea of stating things in a more formal or mathematical way - not for any analytical or formal benefit, but because reviewers explicitly ask me to. So if I do it from the start, perhaps they will be happy from the start?

Another fun form of dismissiveness is the not-my-problem problem: that the work isn't interesting because the reviewer doesn't personally have this problem. It's something I think big company research labs are making worse.

For our MalConv paper, one of the early reviews asked us for extensive results testing many different architecture sizes and variants. While this review as a whole was not mean, the ask itself is untenable for our group. Training the model took a month on a DGX-1, a fact included in the paper, and we don't have any more of them! I've got a good amount of computational resources, but the ask of the reviewer would have taken all of my resources years to produce. A researcher at Google may have such luxuries, but I do not.

A better example of this problem (and the 3rd in my bullet list), which includes the callousness I have come to fear, came from my swell SHWeL paper, which was extending LZJD. In particular, I had come up with the idea to help work around class imbalance problems. This is a big problem for our research in malware detection, as we don't have that much data. It's expensive to get, and companies haven't generally been willing to share. Reviewer #1 was glowing and talking of the paper's high quality. Reviewer #2 voted reject, stating that the class imbalance wasn't important, and since I didn't deal with the adversarial problem the paper wasn't worth considering. Literally calling my paper good "academics" and stating it was a shame that the LZJD paper was published at all.

I almost cried the night I got that review. I thought it was a really good paper, and I agonized over where to send it and how to best polish it. After, I felt like an imbecile. Why was I doing a PhD? I wasn't worthy - I can't even convince people my work is decent. It felt like there was lead in my chest and it was going to drag me down to hell where I could burn with all the other bad researchers who didn't actually contribute anything.

In none of this am I including the frustration in dealing with reviewers who just don't seem to care or read the paper. It makes things agonizingly painful to know your paper was shot down for falsehoods. From reviewers claiming linear SVMs are equivalent to Lasso regularized Logistic Regression, to reviewers saying I should add a figure to show X when such a figure is labeled "Figure 1" and has word-for-word what they asked for. I've even had a reviewer simply claim my paper had insufficient experiments, when 5 pages were dedicated to every experiment I could find from prior works. It's a situation you just can't win.

This blog post is getting on the long side, so I'll wrap it up now. But I want to implore everyone who is ever reviewing: be polite and be constructive. I've had good negative reviews. Reviews that discussed the strengths of my paper and suggested how I could further improve them, not simply rattling off a list of sins I may have committed. All of us are trying to contribute something we felt was valuable enough to put down in black and white.

To those who haven't published before: good luck, and I hope you have a good mentor! I want this post to prepare you for the worst of it, allowing you to better enjoy the good parts! I've found the process to be a net positive in my life so far. With my defense done I have no career need to keep publishing, so I hope my continued effort to do so serves as an indicator that it is worth it overall.

Sunday, March 11, 2018

Reflecting on two years of Academic Publishing & Reviewer #2