The three types of research papers and how I learned to recognise them

After 15+ years of reading, writing, presenting, reviewing, selecting, discussing, stapling, and doodling on the margins of papers, I have concluded that there exist three large families of research papers: about-papers, concept-papers, and question-papers.


About-papers are usually written, read, discussed, championed, and sent as attachments by people who care about an area. They love the area so much that everything that has anything to do with it immediately becomes an interesting read. You can write an about-paper about a dataset that you managed to get hold of, about a trial you ran with real users, about the latest research infrastructure that you are developing, or about your favourite new technology. I love reading well-written about-papers. I just prefer reading them in magazines, newspapers, blogs, newsletters, etc. I definitely don’t like waking up in the middle of the night to review them, especially on weekends and during holidays. The hallmark of an about-paper is its general interest in the area and its relative lack of interest in specific contributions and questions in this area.


A concept is a magic lens through which complex things become simple, helping us to finally understand them. Think of price of anarchy, differential privacy, betweenness centrality, power-usage efficiency. Great concept-papers can have a profound positive impact on our understanding of the world. They cut across areas and problems and reveal underlying hidden truths and structures. Unfortunately, most concept-papers are not of the great type. It’s really tempting to think that you’ve come across the silver bullet that will pierce through any type of steel and concrete. Bad concept-papers confuse and distract. Instead of being a means, they become an end in themselves. In the process they distract our attention from real problems and waste huge amounts of time. The easiest way to write the wrong concept-paper is to believe too much in genius and divine intervention. Despite being more than welcome, neither is a strict prerequisite for a concept-paper. Experience and domain expertise are often all it takes to come up with a great concept after having observed a common structure across different fields and problems. A special case of concept-paper craziness is the technology-concept-paper: using BitTorrent to send people to Mars, Bitcoin to cure cancer, and TCP to alleviate traffic jams in Beijing.


  • Is location-based price discrimination happening in e-commerce?
  • Which advertisers place targeted ads driven by sensitive personal data?
  • How much cross-subsidization exists between heavy and light consumers of residential broadband?
  • What percentage of online advertising revenues go to fake clicks?
  • Who starts fake news campaigns in social media?
  • Can we build sub-10ms delay networks?

Question-papers are all about answering a clear and easy-to-understand question about something that is important and hard to guess without doing some work first. Surely you can find questions in both about- and concept-papers. The difference is that in question-papers it is the question that leads the entire effort, as opposed to taking the back seat as in the other two. A clear and important question is an infallible compass for finding your way among the myriad alternatives arising during any research effort. Putting the question in the driver’s seat makes everything else fall easily into place: the dataset that you need, the expertise required for answering it, the right definition, the right algorithm, the right system, the results to show.

Over the years I have written papers of all three types, but I must admit that lately I only care about question-papers. I would love to write a good concept-paper in the area where I currently work, but I am afraid I still have some questions to ask and answer before being ready to do so.

There I said it: The Net Neutrality “debate” is the Climate Change “debate” of the Internet

i.e., it’s no debate at all, not a serious one at least.

It’s more of a huge mess created by taking the good intentions of well-meaning people and twisting them to fit economic and other interests.

I could stop here and feel relieved that I finally expressed that which has been haunting me after having spent 3 years on a single paper on the economics of interconnection.

It was actually my frustration from failing to explain the data and economic concepts used in this paper that made me give up on the area and turn to privacy and transparency (in this regard “thank you network neutrality camp!”).

I just couldn’t explain the rather obvious economic fact that a supermarket can’t function efficiently if the cashier just weighs how much stuff you get and charges you the same regardless of whether you are taking 5 kilos of potatoes or 5 kilos of white truffle.

I am not planning to write a full essay on this (yet), so here go, in almost random order, my takeaways from these 3 years:

1. Connectivity is a two-sided market

When you are campaigning, writing, lobbying, yelling about network neutrality you actually fight for the right of people to keep paying for 100% of interconnection (aka last mile) costs. You may as well fight for your right to pay a higher price for passes for the MWC so that exhibitors can get free booths, or for your right to pay a higher price for magazines so that advertisers can place ads for free.

2. “What about the little guy?”

Many seem to worry about the small startup without the “deep pockets” that won’t be able to afford paying for the “fast lane”. Well, the little guy has little traffic and therefore doesn’t need to have deep pockets in the first place. If ISPs want to deconstruct this rather superficial argument, they should just offer the fast lane for free to small companies and only charge if traffic (aka business) ramps up. This would give the little guy a competitive advantage over established service monopolies/oligopolies that, frankly, would be the ones to be challenged the most by a change in the current status quo.

3. “Throttling”

A heavy word. Implying that certain traffic will be delayed so that other traffic can take priority over it. This implies a “zero-sum” Internet in which if I have more leg room (like in business class) you are not able to open up your laptop or reach for the fork. The internet is not zero-sum. If companies want to push HD video or gaming traffic they can go through different lines and ports so that you don’t have to hang up your call to your grandma.

4. Internet Democracy, Freedom of Speech, etc

This is where it all began (the “good intentions” I mentioned earlier) but it’s not about this anymore. Any attempt to exploit non-neutrality for something in this area would be suicidal for the brand image of any commercial organisation. Despotic regimes and tyrants don’t care about half measures like non-neutrality – they go straight to good old blocking.

5. “Harm innovation” and other “arguments”

When someone resorts to “harm innovation” it means that he is already losing the “debate” so he has to take out the H-bomb. Usually this comes after “Internet Democracy, Freedom of Speech”, i.e., when 4. fails to convince. The “Harm innovation” “argument” is so fuzzy and ethereal as an “argument” that it is indeed difficult to deconstruct. It’s like trying to shoot at a ghost. You’ll never get it. I’ll just say that “business innovation on the web” is not the only innovation in and around technology.

I’ll stop here for now.

DTL Award Grants’17 announced!

Proud to announce the DTL Award Grant winners for 2017. Our latest batch of funded projects covers new and upcoming areas in data transparency such as: Detection of Algorithmic Bias, Location Privacy, Privacy in Home IoT Devices, Online-Offline Data Fusion, and others. Full list here.

Congratulations to the winners and a big thanks to everyone who participated in the program.

A brief farewell after 10 years

I read Malcolm Gladwell’s Outliers shortly after I joined Telefonica as a researcher 10 years ago. He said that it takes approximately 10000 hours or 10 years to become good at something. I was on my way back from the US. I was leaving academia after 10+ years and I was heading to my first real job. I had a recent PhD, a good number of publications, and a growing number of scientific collaborations under my belt. My viewpoint on things was more or less as follows:

— A Network was a Graph

— Competition was a Strategic Game

— Investment was a Facility Location problem

— Complexity was Combinatorial

— A good solution had to be non trivial

Today is my last day with Telefonica.

— A Network is a mindboggling mess of cables, boxes, buildings, antennas, people, and companies that run around it like bees to keep it running. It’s so amazing that it works most of the time. Nobody fully understands why.

— Competition is an even worse monster. Companies collaborate in one place and compete in another. They are friends today and enemies tomorrow. Regulation, public opinion, and random events can change the game from one day to the next. Good luck trying to make sense of it through Game Theory.

— Cost structures, CAPEX, OPEX are so complex that even producing an accurate bill of something like our total electricity consumption is a highly non-trivial task. You can try to optimize one thing here only to find out that you are breaking 10 things there.

— Complexity is still combinatorial, but not so much in the number of links as in the number of business units, business models, and assumptions that one makes about the future.

— It’s really great when a solution is trivial

Telefonica has been a great school for me. I saw more cable than I ever believed existed. I switched off rows of modems to save energy and got surprised to see the remaining ones locking at a higher bit-rate due to reduced crosstalk. I participated in building real stuff, from CDNs to WIFI aggregators, and from ride-sharing systems to browser addons for privacy. I was given access to tons of numbers about how much things cost and how much traffic goes through them. I got to work with regulators, investment planners, strategy departments, innovation departments, communication and PR people, HR, and almost any other specialty that you can imagine. I was allowed to create an NGO.

Throughout all this I managed to remain a researcher. I don’t know if I managed to fit Gladwell’s predictions, but I am sure I stand way more firm on my feet today than when I walked in. I am grateful for the great opportunities I was given and for the wonderful people that I got to work with these 10 years.

Telefonica is a fantastic place for any young researcher that wants to take a walk on the real side of things.

Via Augusta, circa 2010.

10 thoughts that stuck with me after attending a data protection event almost every month for the last two years

1. Privacy is not hype.

The uncontrolled erosion of privacy is not a “victimless crime”. The cost is just shifted to the future. Could be paid tomorrow — an offending ad — or in a few years — a record of your assumed health status leaking to an insurance company.

2. People don’t currently care much about it but this can change fast.

Indeed, people don’t seem to care that much right now, certainly not enough to give up any of the conveniences of the web. But nothing about it is written in stone. Some other things that people didn’t use to care about: smoking, car safety, airport security, dangerous toys, racial or sexual discrimination. Societies evolve … privacy discussions and debates have started reaching the wider public.

3. Privacy problems will only get worse.

Privacy vs. web business models is a textbook example of a Tragedy of the Commons. The financial temptation is just too great to be ignored, especially by companies that have nothing to risk or lose. Just find a niche for data that somebody would be willing to pay good money for, and go for it. Even if all the big companies play totally by the book, there’s still a long tail of thousands of medium to small trackers/data aggregators that can destroy consumer and regulator trust.

4. The web can actually break due to (lack of) privacy.

The web, as big and successful as it is, is not indestructible. It too can fall from grace. Other media that once were king are no longer. Newspapers and TV are nowhere near their prior glory. Loss of trust is the Achilles’ heel of the web.

5. Privacy extremism or wishful thinking are not doing anybody any good.

Extremists on both sides of the spectrum are not doing anybody any good. Stopping all leakage is both impossible and unnecessary. Similarly, believing that the market will magically find its way without any special effort or care is wishful thinking. There are complex tradeoffs in the area to be confronted. That’s fine and nothing new really. Our societies have dealt with similar situations again and again in the past. From financial systems to transportation and medicine, there are always practical solutions for maximizing the societal benefits while minimising the risks for individuals. They just take time and effort before they can be reached, with lots of trial and error along the way.

6. Complex technology can only be tamed by other, equally advanced, technology.

Regulation and self-regulation have a critical role in the area but are effectively helpless without specialised technology for auditing and testing for compliance, whether pro-actively or reactively. Have you taken your car in for service lately? What did you see? A mechanic nowadays merely connects a computer to the car’s own computer, which checks it by running a barrage of tests. Then he analyses and interprets the results. A doctor is doing a similar thing but for humans. If the modern mechanic and doctor depend on technology for their daily job, why should a lawyer or a judge be left alone to make sense of privacy and data protection on the internet with only paper and a briefcase at hand?

7. Transparency software is the catalyst for trusting the web again.

Transparency software is the catalyst that can empower regulators and DPAs while creating the right incentives and market pressures to expedite the market’s convergence to a win-win state for all. But hold on a second … What is this “Transparency software”? Well, it’s just what its name suggests: simple-to-use software for checking (aha, “transparency”) for information uses that users or regulators don’t like. You know, things like targeting minors online, targeting ads to patients, or making arbitrary assumptions about one’s political or religious beliefs or sexual preference.

A simple but fundamental idea here is that since it is virtually impossible to stop all information leakage (this would break the web faster than privacy), we can try to reduce it and then keep an open eye for controversial practices. A second important idea is to resist the temptation of finding holistic solutions and instead start working on specific data protection problems in given contexts. Context can improve many of our discussions and lead to tangible results faster and easier. If such tangible results don’t start showing up in the foreseeable future, it’s only natural to expect that everyone will eventually be exhausted and give up on the whole privacy and data protection matter altogether. So why don’t we start interleaving some more grounded discussions with our abstract ones? Pick one application/service at a time, see what (if anything) is annoying people about it, and fix it. Solving specific issues in specific contexts is not as glamorous as magic general solutions, but guess what: we can solve PII leakage issues in a specific website in a matter of hours and we can come up with tools to detect PII leakages in six months to a year, whereas coming up with a general-purpose solution for all matters of privacy may take too long.
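To give a feel for what “solving PII leakage in a specific website” can look like in practice, here is a minimal sketch in Python. It is not the code of any tool mentioned in this post. The function names (`pii_variants`, `find_pii_leaks`), the example hosts, and the choice of MD5, SHA-1, and SHA-256 as the hashes worth looking for are all assumptions made for illustration. The idea is simply: record the URLs a page requests, then check whether a known piece of personal data (or a common hash of it) shows up in requests going to third parties.

```python
import hashlib
from urllib.parse import urlsplit, unquote


def pii_variants(value):
    """Return the normalized value plus some common hashed forms of it.

    Assumption: trackers receive the value either in plain text or as an
    unsalted MD5/SHA-1/SHA-256 hex digest of the lowercased string.
    """
    normalized = value.strip().lower()
    raw = normalized.encode("utf-8")
    return {
        "plain": normalized,
        "md5": hashlib.md5(raw).hexdigest(),
        "sha1": hashlib.sha1(raw).hexdigest(),
        "sha256": hashlib.sha256(raw).hexdigest(),
    }


def find_pii_leaks(first_party_host, request_urls, pii_values):
    """Flag third-party requests whose URL contains known PII or its hash.

    Returns a list of (third_party_host, pii_value, encoding) tuples.
    """
    leaks = []
    for url in request_urls:
        host = urlsplit(url).hostname or ""
        # The site itself and its subdomains are treated as first party.
        if host == first_party_host or host.endswith("." + first_party_host):
            continue
        # Undo percent-encoding so "alice%40example.org" matches the plain email.
        haystack = unquote(url).lower()
        for value in pii_values:
            for encoding, needle in pii_variants(value).items():
                if needle in haystack:
                    leaks.append((host, value, encoding))
    return leaks


if __name__ == "__main__":
    # Hypothetical request log for a hypothetical shop; all hosts are made up.
    email = "alice@example.org"
    hashed = hashlib.md5(email.encode("utf-8")).hexdigest()
    log = [
        "https://shop.example/cart?email=alice%40example.org",
        "https://tracker.example/pixel?uid=" + hashed,
        "https://cdn.tracker.example/lib.js",
    ]
    for leak in find_pii_leaks("shop.example", log, [email]):
        print(leak)
```

A real tool has to deal with much more than this (request bodies, cookies, salted or chained hashes, deciding what counts as a third party), which is where the six months to a year go. But for one site and one kind of identifier, a check along these lines is the sort of thing that fits in an afternoon.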

8. Transparency works. Ask the telcos about Network Neutrality.

Transparency has in the past proved to be quite effective. Indeed, almost a decade ago the Network Neutrality debate was ignited by reports that some Telcos were using Deep Packet Inspection (DPI) equipment to delay or block certain types of traffic, such as peer-to-peer (P2P) traffic from BitTorrent and other protocols. Unnoticed among scores of public statements and discussions, groups of computer scientists started building simple-to-use tools to check whether a broadband connection was being subjected to P2P blocking. Similarly, tools were built to test whether a broadband connection matched the advertised speed. All a user had to do to check whether his ISP was blocking BitTorrent was to visit a site and click on a button that launches a series of tests and … voilà. Verifying actual broadband speeds was made equally simple. The existence of such easy-to-use tools seems to have created the right incentives for Telcos to avoid blocking while making sure they deliver on speed promises.

9. Market, self-regulation, and regulation, in that order.

Most of the work for fixing data protection problems should be undertaken by the market. Regulators should always be there to raise the bottom line and scare the bad guys. Independent audit makes sure self-regulation is effective. It gives it more credibility, since independent parties can check that it delivers on its promises.

10. The tools are not going to build themselves. Get busy!

Building the tools is not easy. Are we prepared? Do we have enough people with the necessary skills to build such tools? Questionable. Our $heriff tool for detecting online price discrimination took more than 2 years and very hard work from some very talented and committed PhD students and researchers. The same goes for our new eyeWnder tool for detecting behavioural targeting. Luckily the Data Transparency Lab community of builders is growing fast. Keep an eye out for our forthcoming call for proposals and submit your ideas.

“Oh … but people don’t care about privacy”

If only I had a penny for every time I’ve heard this aphorism!

True, most typology studies out there as well as our own experiences verify that currently most of us act like the kids that rush to the table and grab the candy in the classic delayed gratification marshmallow experiment: convenience rules over our privacy concerns.

But nothing is written in stone about this. Given enough information and some time to digest it, even greedy kids learn. Just take a look at some other things we didn’t use to care about:

Airport security

Never had the pleasure of walking directly into a plane without a security check but from what I hear there was a time that this was how it worked. You would show up at the airport with ticket at hand. The check-in assistant would verify that your name is on the list and check your id. Then you would just walk past the security officer and go directly to the boarding gate. Simple as that.

Then came hijackers and ruined everything. Between 1968 and 1972, hijackers took over a commercial aircraft every other week, on average. So long to speedy boarding, and farewell to smoking on planes 20 years later. If you want to get nostalgic, here you go:


Since we are on the topic of smoking, and given that lots of privacy concerns are caused by personal data collection practices in online advertising, I cannot avoid thinking of Betty and Don Draper with cigarettes in hand at work, in the car, or even at home with the kids.


To be honest I don’t have to go as far as the Mad Men heroes to draw examples. I am pretty, pretty, pretty sure I’ve seen some of this in real life.

Dangerous toys

Where do I start here? I could list some of my own but they are nowhere near as fun as some that I discovered with a quick search around the web. Things like:

  • Glass blowing kit
  • Lead casting kit
  • Working electric power tools for kids
  • The kerosene train
  • Magic killer guns that impress, burn, or knock down your friends.
Power tools for junior

Pictures are louder than words. Just take a look at The 8 Most Wildly Irresponsible Vintage Toys. Last in this list is the “Atomic Energy Lab” which brings us to:

Recreational uses of radioactive materials

I love micro-mechanics and there’s nothing more lovable about it than mechanical watches. There is a magic in listening to the ticking sound of a mechanical movement while observing the seconds hand sweep smoothly above the dial. You can even do it in the dark because modern watches use Super-LumiNova to illuminate watch dial markings and hands.

But it was not always like that. Before Super-LumiNova, watches used Tritium and before that … Radium.

Swiss Military Watch Commander model with tritium-illuminated face
Radium watch hands under ultraviolet light

I am stretching dangerously beyond my field here, but from what I gather, Tritium, a radioactive material, needs to be handled very carefully. Radium is downright dangerous. I mean “you are going to die” dangerous. Just read a bit about what happened to the “Radium Girls” who used to apply radium on watch dials in an assembly line in the ’20s.

Radium girls

But we are not done yet. Remember the title of the section is “Recreational uses of radioactive materials”. Watch dials are just the tip of the iceberg. It’s more of a useful than a recreational thing to be able to read the time in the dark (with some exceptions). Could society stomach the dangers for workers? Who knows? It doesn’t really matter because there are these other uses that were truly recreational (in the beginning at least), for which I hope the answer is pretty clear. Here goes the list:

  • Radium chocolate
  • Radium water
  • Radium toothpaste
  • Radium spa
Radium Schokolade

Details and imagery at 9 Ways People Used Radium Before We Understood the Risks.

Anyhow, I can go on for hours on this, talk about car safety belts, car seat headrests, balconies, furniture design, etc., but I think what I am getting at is clear: Societies evolve.

It takes some time and some pain but they evolve. Especially in our time, with the ease with which information spreads, they evolve fast. Mark my words, it won’t be long before we look back and laugh at the way we approached privacy in the happy days of the web.