As part of a project on automatic transcription of Broadcast News in French, we needed a corpus of manually transcribed Broadcast News material to train and develop the system. For that purpose, a suitable software environment was required. The LDC already had experience creating Broadcast News corpora. However, the software environment they used ran on high-end servers or relied on third-party software. We therefore looked for other solutions (which are detailed in an article presented at the LREC conference; see Question 3 below), but none was fully satisfactory, so we decided to build our own tool, in collaboration with the LDC.
There are several reasons:
To better understand why one might be willing to distribute software freely, you can read about the philosophy of free software promoted by the FSF.