Freeing the Lion

Alexandre Oliva

This is a tale of how the young FSFLA, trapped with a Lion, set it Free and enabled every law-abiding person in Brazil to comply with income tax obligations in freedom.

Some historical background

In October last year, we launched FSFLA's campaign against "Softwares Impostos" in Brazil. It's based on an understanding that the Brazilian Federal constitution already demands software required for interaction between the government and citizens and taxpayers to be Free Software.

A primary focus, all the way from the beginning, was software distributed by Receita Federal (AKA RF, the Brazilian IRS) for people to fill in and turn in (over the Internet) their income tax forms. Although many people have an option to fill in these forms by hand, on paper, some are required to fill in and submit these forms electronically, necessarily using programs for MS-Windows or for Sun-Java, distributed by RF. So, you see, "impostos" means both "taxes" and "imposed", which is why the campaign is against "softwares impostos" ;-)
http://www.fsfla.org/?q=en/node/120

The piece referenced below sums up some of the most serious problems with the programs distributed by RF.
http://www.fsfla.org/?q=en/node/122#Editorial

By mid December last year, RF released a beta version of its form-filling software that (as usual) wouldn't run on 100% Free Software platforms, and wasn't itself Free Software. After evaluating our alternatives, we published an article (in Portuguese only) and started direct action towards RF to try to fix that and other problems, such that people could comply with their tax obligations without giving up their freedom and without breaking the law.
http://www.fsfla.org/?q=pt/node/143
http://www.fsfla.org/?q=en/node/145
http://www.fsfla.org/?q=en/node/147
http://www.fsfla.org/?q=en/node/148#1

On March 1st, a non-beta version of the program was released, failing to comply with the law and with our requests. Over the month of March, we launched a petition detailing our requests, and engaged in negotiations with RF. They published the file formats, so that people could have an idea of what they were turning in, even if writing a program to perform these tasks was still impossible because critical information was missing. RF came up with the idea of "implicit copyright licenses", whose reasoning led us to the idea that we had an implicit license to decompile their program and release it as Free Software. Oh, and by exploding the ZIP and JAR files in IRPF2007, looking for the "implicit copyright license" in there, we found out that RF used a bunch of Free Software libraries in their software, but failed to comply with the licenses of most of them.
http://www.fsfla.org/?q=pt/node/152 (in pt_BR only, unfortunately)
http://www.fsfla.org/?q=en/node/153#3

How to comply with the tax law?

We kept on negotiating with RF in April, but no further progress was made. One of the FSFLA board members was in a difficult position because the end-of-April deadline was looming, and there was not much he could do. In theory, he had the following options:

  • using the paper form, no later than April 30

  • using IRPF2007 to fill in the tax form, then:

      • turning it in at a bank, no later than April 27

      • using ReceitaNet (another piece of software distributed by RF) to turn it in over the Internet, no later than April 30

  • reverse-engineering IRPF2007 to complete the file format specification, and then turning in a file prepared by hand or using some other piece of software

Using the paper form would break the law, so it was not an option. Using the software distributed by RF was not an option either, because its use, and even its distribution by RF, were illegal. Never mind that its use would require him to give up his moral, philosophical and political beliefs protected by the Brazilian Federal Constitution.

Hiring a third party to run the software was not much of an option, because it would amount to forming a team to commit a crime, in which the hired party would be in it for profit. Both would turn the illegal use into an even more serious crime. And since the government is the copyright holder, this infringement is punishable even in the absence of complaint by the copyright holder.

Reverse-engineering looked like the only legal alternative to breaking the law or filing a very costly lawsuit. But it looked like a lot of work, and time was running short.

Meanwhile, our sister organization Free Software Foundation Europe, through its Freedom Task Force, found out there was another license violation we'd missed: the IRPF2007 installer for GNU/Linux/x86 included code from GNU libc without complying with its license. This was the only infringement on copyrights held by any FSF. It's nothing less than the original FSF that holds the copyrights on GNU libc.

So, on April 10, we got back in touch with RF for further negotiations. Once we cleared some press-induced misunderstandings and presented the evidence of copyright infringement, we asked for action and a few favors. A request for an exception, to turn in the declaration on paper, was denied. Also, RF said it wouldn't be able to publish a license for use of its software before the April 30 deadline, and that its lawyers and technical experts were already looking into the license violations, but they wouldn't be done by the April 30 deadline. Nevertheless, a request for a deadline extension, such that the fixed software could be legally used, was denied.

The request for completing the file format specification with the hashing algorithm was declined, as was the idea of publishing the original source code, even though both could have been required under the constitutional principle of transparency for official acts. The rationale was the nonsensical fear that third parties could publish modified versions, which could confuse gullible users and put them at risk. It was no use to point out that people could already do that and that, in fact, if we wanted to, we could easily publish a "Tax Free" version of IRPF2007.

Reverse engineering

Having been advised to pursue friendly negotiations rather than litigation, we returned to the reverse-engineering efforts. After some very limited success with another Free Java decompiler, we found JODE. After a little bit of patching, it managed to decompile all the software authored by Serpro for Receita Federal's IRPF2007, almost all of it into very readable source code, and almost all of it perfectly recompilable.
http://jode.sf.net/

But understanding what exactly was fed to the "secret" Crc32 hashing algorithm proved to be tricky. Comments and documentation, missing from the decompiled sources but certainly present in the original source code, would probably have helped.
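To illustrate why the input matters more than the algorithm, here is a minimal Python sketch of a standard CRC-32 computation using the zlib module. It assumes nothing about the actual declaration format: the record contents and the helper name crc32_hex are made up for illustration, and whether IRPF2007's Crc32 matches the standard polynomial is not something this text confirms.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """Return the standard CRC-32 of data as 8 uppercase hex digits."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")


# Hypothetical record: NOT the real declaration layout.
record = b"IRPF;2007;EXAMPLE"

# The checksum depends on every byte fed in, so a line terminator,
# a padding field or a different character encoding changes the result.
without_terminator = crc32_hex(record)
with_terminator = crc32_hex(record + b"\r\n")

# Feeding the same bytes in several pieces gives the same checksum
# as feeding them all at once: only the byte stream matters.
piecewise = zlib.crc32(b"56789", zlib.crc32(b"1234")) & 0xFFFFFFFF
all_at_once = zlib.crc32(b"123456789") & 0xFFFFFFFF
```

Knowing the algorithm is therefore not enough: one still has to find out exactly which bytes, in which order and encoding, go into it.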

But then it hit him. RF couldn't possibly be requiring every taxpayer to jump through these hoops in order to comply with the law, including law-mandated tax obligations, while at the same time it was breaking the law itself.

There had to be another way, and there was! Right there, staring at him since he started the license compliance investigation, was a copy of the GNU LGPL version 2, in the root directory of IRPF.jar. All of a sudden, everything became clear. That was the "implicit license"!

There were other license files in there, but it was obvious that they were not applicable: one of them was an old Jasper Reports-specific license, and the other, the Apache 1.1 license, was in a sub-directory, so clearly it wasn't meant to apply to everything. It had to be the LGPL.
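The kind of search described above can be sketched in a few lines of Python, since a JAR is just a ZIP archive. This is only an illustration: the file names in the demo archive, the list of name hints and the helper names are all assumptions, not the actual contents of IRPF.jar.

```python
import io
import zipfile

# Name fragments that suggest a license file (an assumed, incomplete list).
LICENSE_HINTS = ("license", "licence", "copying", "lgpl", "gpl", "notice")


def find_license_files(archive):
    """Return (top_level, nested) lists of license-looking entry names."""
    top_level, nested = [], []
    with zipfile.ZipFile(archive) as jar:
        for name in jar.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            base = name.rsplit("/", 1)[-1].lower()
            if any(hint in base for hint in LICENSE_HINTS):
                # Root-level files plausibly cover the whole archive;
                # files in sub-directories plausibly cover only that part.
                (nested if "/" in name else top_level).append(name)
    return top_level, nested


def build_demo_jar():
    """Build a tiny in-memory archive with hypothetical entries."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("LGPL.txt", "placeholder license text")
        jar.writestr("org/example/LICENSE.txt", "placeholder license text")
        jar.writestr("org/example/Main.class", b"\xca\xfe\xba\xbe")
    buf.seek(0)
    return buf


top_level, nested = find_license_files(build_demo_jar())
```

Separating root-level files from nested ones mirrors the reasoning in the text: a license in the root directory looks like it applies to everything, while one in a sub-directory does not.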

Free Software License, non-Free Software?!?

So there, he had a license to run the program, and it was a Free Software license. But... where was the source code that should have accompanied the binary, per the included license, to make it Free Software? Well, it was odd, but certainly not illegal, for RF to distribute its own code under the LGPL without offering source code, because the requirement to offer source code applies to licensees, not to the licensor.

And then, RF had forgotten to offer source code of the LGPLed libraries it used in its software, so it was not at all surprising it had forgotten to publish its own.

Anyhow, since there was after all a license, requiring others to use the program was no longer like forcing them to break the law. Once we understood RF's code was under the LGPL, RF's requirement to use its own code no longer violated the constitutional principle of lawfulness, which it previously appeared to violate on two counts.

One problem was the apparent lack of a license to use the program. But RF did provide everyone with a license to run, study, modify and distribute the program, with or without modifications.

The distribution and use of the program still broke the law because of the accidental failure to comply with third-parties' copyright licenses, but that was relatively easy to fix, it would just take some will and some time. Theirs, or ours. Unfortunately, RF said it wouldn't be done by the deadline it imposed on us, and it refused to extend the deadline.

The other problem was about our philosophical beliefs that accepting restrictions to our freedoms pertaining to software is immoral, and that imposing such restrictions is unethical. Such beliefs must be respected, according to the constitution, because the requirement wasn't imposed on all. Since we could turn RF's LGPLed code into Free Software by means of decompilation, running it by itself would not be a moral problem for us.

Nevertheless, a number of the other components included in the package were or had become non-Free, because of legal or illegal distribution without source code, or by other license restrictions in a few packages.

Therefore, on both legal and moral grounds, we still couldn't run the complete software: it depended on non-Free and illegally-distributed components, and even on a non-Free platform. But we knew we could fix that!

We'd have to remove all non-Free components, obtain the source code of RF's LGPLed code, and fix the license compliance problems of the other components, such that running the end result wouldn't break software copyright law, and it wouldn't violate our moral, philosophical or political beliefs. So that was what we had to do.

Freeing the Lion

So, on April 21, the efforts to set the Lion Free got started. The date was fitting, because it was Tiradentes's holiday, and Tiradentes is a Brazilian independence martyr. As for the Lion, that's how Brazilians refer to income tax.

By April 23, the decompiled Java source code had been cleaned up and minimally fixed by hand to enable full re-compilation; source code for all the Free Software libraries had been located, brought into the package and minimally adjusted such that it would compile in a 100% Free Software environment; and the non-Free libraries had been replaced by stubs or removed. It all compiled, and it even ran, to the point of displaying the main screen.

But that was all. No surprise, since that's as far as the original version got on 100% Free Software Java Virtual Machines available at the time. This is probably due to some bug in GNU Classpath's implementation of Swing, and it was quite unfortunate to find out that newer versions of the Free Software GUI components used by IRPF2007 didn't work around that bug.

Nevertheless, this combination of software was packaged and published for external testing. Alas, some people claimed it didn't work for them either, not even on non-Free Java platforms. It looked like we'd taken out too much Windows- and MacOS-specific GUI code.

Debugging this would be very difficult, and the deadline was too close, so a difficult decision had to be made: give up the GUI, and go for a CLI.

That proved to be a great idea. The program stored its data in very easy-to-understand XML files. Editing them by hand was also trivial after running them through some XML processor that added indentation and line breaks.
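Any XML processor will do for that step. As a minimal sketch, Python's standard xml.dom.minidom can add the indentation and line breaks; the element names below are invented for the example and are not the ones IRPF2007 uses.

```python
import xml.dom.minidom


def pretty_xml(text: str) -> str:
    """Re-serialize XML with one element per line and two-space indents."""
    dom = xml.dom.minidom.parseString(text)
    return dom.toprettyxml(indent="  ")


# Hypothetical single-line data file, as a program might write it.
raw = "<declaration><asset><description>car</description></asset></declaration>"
readable = pretty_xml(raw)
print(readable)
```

Once the file has one element per line, an ordinary text editor is enough to review and change the entries.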

After a few more hours of hacking, all the essential features needed to fill in an electronic income tax form, and to prepare it for transmission, had been located in the original source code, and provided through the CLI application.

Editing the XML file turned out to be much better than using the click-until-your-hand-hurts GUI version. It was possible to reorder entries (such as goods, debts, dependents, etc) in whatever way made sense to you, copying and global string replacement worked, and you could make other convenient changes that the limited GUI wouldn't let you make.
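The same edits can also be scripted. The sketch below, using Python's standard xml.etree.ElementTree, does a global string replacement and then reorders entries; the document structure, element names and values are hypothetical, chosen only to show the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical data file: element names and values are made up.
SAMPLE = """<declaration>
  <assets>
    <asset><description>Old Bank savings</description><value>300</value></asset>
    <asset><description>car</description><value>1000</value></asset>
  </assets>
</declaration>"""


def sort_assets_by_value(root):
    """Reorder <asset> entries in place, highest value first."""
    assets = root.find("assets")
    assets[:] = sorted(
        assets, key=lambda a: float(a.findtext("value")), reverse=True
    )


# Global string replacement works on the raw text before parsing.
renamed = SAMPLE.replace("Old Bank", "New Bank")
root = ET.fromstring(renamed)
sort_assets_by_value(root)

descriptions = [a.findtext("description") for a in root.iter("asset")]
```

After running this, descriptions lists the entries in their new order, with the bank renamed everywhere it appeared.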

Nothing like reducing the amount of taxes you had to pay, of course. In fact, you didn't have to compute these numbers yourself: the CLI offered features to use the "recompute" functionality available in the original program, and to check for any errors or missing entries that were required. Perfect!

By early morning on April 25, Alex had his declaration nearly ready to take to the bank, so he published the CLI version used to prepare it and went to bed. He got up at lunch time, took his daughter to school, picked up some information at the bank that he still needed to complete the forms, located a functional floppy disk drive and media, and copied the complete declaration file to the disk.

At the bank, the floppy disk was read perfectly, but the file was rejected. Uh oh! Houston, we have a problem! How could we possibly figure out what was wrong, if all the information we had was that the file was not a correct declaration file?

Fear... Had the code been decompiled incorrectly, such that the information was not formatted properly, or computed values or even the hashes were wrong?

Nah, nothing that serious. It was just missing the receipt number. Yeah, that's right! The receipt number is generated by IRPF2007 itself, and, because of a silly cut&pasto, the call to add it to the declaration file was missing. Once that was fixed, the file it produced was happily accepted at the bank. A working release of the newly-freed Brazilian tax form-filling software was published, and a press release about it went out early the following morning, in Portuguese. The translations took about a week longer.
http://www.fsfla.org/?q=en/node/157

Reactions

The April 30 deadline came and went. Everything was quiet, except for the cheering in the Free Software community in Brazil and some newcomers at the PSL-Brazil mailing list trying to understand what and why this had happened.

May 1st was a holiday. On May 6, we were informed that Receita Federal had silently published a new version of ReceitaNet and of IRPF2007, no earlier than May 2nd. The IRPF2007 release fixed some (but far from all) of the licensing problems, along with a statement that explicitly permitted any taxpayer to use the program for the sole purpose of filling in income tax forms. Nothing like that for ReceitaNet, though.

This amounted to a number of steps towards complying with our petition: taxpayers now have permission to run the program IRPF2007, to hire third parties that are taxpayers themselves to run it for them, and to modify the program so as to enable them to prepare their tax returns. However, the distribution by RF was still illegal, since it was not in compliance with third-party copyright licenses. And still, no corresponding source code anywhere to be seen.

It was also a major step backwards: the explicit statement attempts to limit the purpose of the program, which is in conflict with software freedom and with the LGPL under which the program still appears to be released. And the class files are now obfuscated, which might render decompilation more difficult. The software itself does not mention any such limitations, though, so it still looks like LGPL is the intended license for software not covered by IRPF-Licenses.txt.

More permissions

After that, we got suggestions that permission to modify the program the way we needed is explicitly granted by Brazilian software law, based on Brazilian copyright law. It states (Art 6º, IV) that integration of a program into an application or operating system, retaining its essential features, is not a copyright violation, as long as it is technically indispensable for the user, and it's for exclusive personal use.
http://www.planalto.gov.br/CCIVIL/Leis/L9609.htm

We were also told that the permission to distribute the program was already granted by the regulation that approved the program, even though the program didn't quite comply with the Java platform requirements specified in the regulation itself.
http://www.receita.fazenda.gov.br/Legislacao/Ins/2007/in7192007.htm

Take these two permissions together and it doesn't even look like we needed the LGPL license to be able to publish the CLI tool along with source code. But, hey!, not complaining. LGPL is not the best Free Software license, but it's much better than any non-Free Software license.

Security threat?

We understand this second permission applies to any form in which the program can be represented, including the source code obtained by means of trivial mechanical translation. So we don't quite understand RF's fear of publishing the source code, its decision to obfuscate the binaries, or even its recent allegations to the press (yet to be published) that releasing the source code would amount to a major security threat.

Surely, if it's so important to keep it secret, they shouldn't be publishing it in the first place, in any form whatsoever. Not even in obfuscated binary code!

They ought to take some security classes and learn that they can only entrust with their secrets (and security needs) code that they can inspect, and that they run themselves, on their own machines. (Maybe it's because they don't realize this that they don't understand why we care so much about being allowed to inspect the code we run on our machines.)

Once they publish their secrets in a freely redistributable form, they're publicly-available knowledge, even if it takes a lot of effort for someone to figure out what that knowledge was. Even more so when it's a no-effort task such as decompiling unobfuscated Java binaries containing debug information, like the original release of IRPF2007.

Lesson #0 of computer security is that obscurity does not provide for security. At most, it can make what's already secure slightly more secure. If it even makes sense to say more secure, given that it implies that what is "already secure" isn't secure in the first place.

If publishing the source code exposes any major security threat, for how long have they been negligently under this threat? Are they going to try to shift the blame onto the whistle-blower? Time will tell...


Copyright 2007 Alexandre Oliva

Permission is granted to make and distribute verbatim copies of this entire document without royalty provided the copyright notice, the document's official URL, and this permission notice are preserved.