Saturday, January 29, 2005

All your cable modems combined, I am Captain Pirate!

I read this interesting article and this one on music during one of my quests into the World Wide Web. That article reflects what I had already thought could be the case with P2P and file sharing, but which I had dismissed as too radical a view on the issue. Besides, being on the other side of the law, I thought I might be a little too biased on the topic. But now I can rest easier knowing this view was actually proved right.

I used to think that the software and media industries were not losing money at all because of P2P file sharing, or even pirated CDs. I have even read many interviews in DIGIT about people moving to original software instead of bootlegged copies in order to avoid the headache of missing features, files, etc. So people who want an original version and have the money for it still buy it. Those who want the software but won't or can't pay for it swap it with their peers on the Internet. So essentially, peer-to-peer pirating should have no effect on the sales of such products. But the companies' quarterly reports show severe losses, mainly attributed to P2P. What's going on? Which figure is lying?

I don't know much about this, but my two cents: I believe the music and software industries have simply lost those new buyers who would have liked the product and had the means to buy it, but preferred to get a copy while idling in their chairs.


There can be no doubt that P2P has started receiving all the limelight. New P2P architectures and applications are making file sharing faster, more comprehensive, more up-to-date and more anonymous. In such a scenario, existing copyright technologies fail to prevent illegal copies from being made. The media and software industries are boiling over as they watch their products being swapped between countless users without any royalty being paid. Sure, there were a few arrests and cease-and-desists, but that's not going to stop another P2P technique from providing more anonymity to the bootlegger. The cat-and-mouse chase can go on forever, with the media cops catching up with the bootleggers somehow, but can't this be avoided? Why is it happening?

A long time ago, when men were men, women were women, small furry creatures from Alpha Centauri were small furry creatures from Alpha Centauri and Objects were real, physical objects, the concept of object ownership was simple: anything that you created, grew, planted or staked was yours. Period. Everyone was happy. The birds chirped. Leaves rustled and Life was beautiful. That is, until someone came up with mangled-looking scrawls called "Programs" that would inevitably crash big boxes called "Computers". If the ancient law of ownership, which had made people's lives oh-so-happy, were to be applied to them programs, no company could have made a single dollar. This was a time when security people could get away with saying things like floppies and "Dongles" without getting weird looks from other people. This was a time when the Internet was an obscure piece of science fiction rumored to exist at some big university. This was a time when file sharing involved two environmentalists writing on a single sheet of paper.

Times have definitely changed, and here we are with the same old copyright protection laws offering paper-thin resistance against copying, when multi-megabyte files can be downloaded and uploaded for millions of others to download and upload in turn. P2P and other disruptive technologies may have turned many people into illegal software owners.

According to an article I read recently, the cops in the United States had a problem: illegal aliens (foreigners, in case you let your imagination wander) are at large, and their sheer number makes for a large suspect pool to investigate every time a crime occurs. The security advisors came up with an utterly surprising and clever plan: instead of ignoring the presence of illegal aliens among us, let us recognise them and provide them with alien driving licences. Now this may come as a shock to a lot of people, but the idea has a clever basis. Since the aliens can obtain a licence, the good people of the lot will come forward and let themselves be identified as good citizens. The remainder of the lot will contain the bad eggs. This will at least reduce the number of fake driving licences and other documents, and at the same time give the good guys a chance to prove themselves.

Applying the same logic to our case, instead of trying to ignore the fact that illegal software owners will always be around, why not come up with copyright laws that acknowledge such users' presence and invent some kind of compromise plan to keep their numbers in check? I am not sure if such a scheme would be the best, but it sure seems a good change of perspective to me. I think Microsoft has come up with something like that in distributing future software updates to its operating systems.


PS: Okay, the title of this post doesn't have anything to do with the content, but it seemed cool :)

Thursday, January 06, 2005

BitTorrent, the slayer.

The first time I heard of BitTorrent, some unthinkable months ago, I was grinning so broadly my grinning muscles hurt. It was as if some mythical, unslayable demon had been magically slain by the words that I was reading. How rude of me! Let me introduce that demon to you.
Meet Dm. Redundant Packets (Dm = Demon ;))
Dm. Redundant Packets was a mythical kind, since he was a demon only in my head. I saw him whenever a server had to send the same data packets to multiple hosts, even if the hosts were requesting the packets simultaneously. The demon really stank when a server buckled trying to send the same file to too many hosts. It was quite horrifying to watch all this, but since I could think of no solutions, I was but a mute spectator.
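To put a rough number on the demon, here is a minimal back-of-the-envelope sketch in Python. The file size, host count and uplink speed are made-up figures purely for illustration, and the function names are my own; the only point is that a lone server's upload grows linearly with the number of hosts.

```python
def server_upload_bytes(file_size_bytes, num_hosts):
    """Bytes a lone server uploads when every host gets its own full copy."""
    return file_size_bytes * num_hosts


def hours_to_serve(total_bytes, uplink_bits_per_sec):
    """Time needed to push total_bytes through a fixed uplink, in hours."""
    return total_bytes * 8 / uplink_bits_per_sec / 3600


# Hypothetical numbers, chosen only for illustration.
FILE_SIZE = 700 * 1024 * 1024   # one 700 MiB file
HOSTS = 1000                    # hosts asking for the same file
UPLINK = 100_000_000            # a 100 Mbit/s server uplink

total = server_upload_bytes(FILE_SIZE, HOSTS)
hours = hours_to_serve(total, UPLINK)

print(f"Total upload: {total / 1024**3:.1f} GiB")
print(f"Time on the uplink: {hours:.1f} hours")
```

Every one of those copies is byte-for-byte identical, which is exactly what makes the packets redundant.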

Then came a lecture that introduced me to "multicasting", a form of "broadcasting" restricted to a predefined set of exclusive hosts. They are exclusive because, as I understood it, in order to be included in a multicast group you need to pay some money and register your "group" under a multicast IP address. As opposed to broadcasting, where every Jack connected to the sender gets sent a copy of the same packet, in multicast the sender sends only one copy of the data, addressed to the multicast group. The multicast IP address (Class D: 224.0.0.0 to 239.255.255.255) is needed to identify the group, since the multicasting happens at the router level. Though my knowledge stops here with multicast servers and MBones, I knew enough to believe that the demon wasn't dead yet.
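For the curious, the Class D range quoted above is the same thing as the block 224.0.0.0/4. Here is a small sketch using Python's standard ipaddress module that checks whether an address falls inside it. The helper name is_class_d and the sample addresses are mine, picked to sit on either side of the boundaries.

```python
import ipaddress

# The Class D (multicast) block: 224.0.0.0 through 239.255.255.255.
CLASS_D = ipaddress.ip_network("224.0.0.0/4")


def is_class_d(address):
    """Return True if the dotted-quad address lies in the Class D range."""
    return ipaddress.ip_address(address) in CLASS_D


# Sample addresses on and around the edges of the range.
for addr in ("223.255.255.255", "224.0.0.0", "239.255.255.255", "240.0.0.0"):
    print(addr, is_class_d(addr))
```

The standard library also exposes the same check directly as the is_multicast property of an address object.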

Joining multicast groups is costly and this is something which can't be done every time a server wants to send data to multiple temporary hosts.

Enter BitTorrent.

I was reading about BitTorrent some months back and realized its potential as a new distribution system. Though I feel this way about every new P2P system, mainly because P2P architectures excite me, it was different this time. That is because BitTorrent smooths over one major inevitability in P2P systems: freeloaders. Simply put, freeloaders are those who download but do not share. P2P systems have adapted to try to eliminate this problem. Kazaa implemented a "Participation Level" counter which tells other hosts how many files the user is sharing. It worked somewhat, but was not enough.

BitTorrent follows the "Give And Ye Shall Receive" mantra. What BT does is split a file into small, equal-sized pieces and let people download the pieces from other people who have already downloaded them. This way, the original seed ideally doesn't have to upload more than that file's size worth of data. To ensure that the pieces are legitimate, BT uses SHA-1 to hash each piece into a 160-bit (20-byte) hash. These hashes and other meta information are stored in the torrent file, which is made publicly available. The actual file itself is spread over the world, literally. If anyone is downloading the file in question, you are sure to get it too, and by downloading it you in turn ensure that future seekers of the file will get it. But they will get it only as long as someone is seeding and/or downloading it.
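The piece-and-hash idea can be sketched in a few lines of Python with the standard hashlib module. This is only an illustration of the principle, not the actual torrent file format: the 256 KiB piece length, the function names and the stand-in payload are all my own assumptions.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # hypothetical piece size; real torrents pick their own


def split_into_pieces(data, piece_length=PIECE_LENGTH):
    """Cut the data into fixed-size pieces (the final piece may be shorter)."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_hashes(pieces):
    """SHA-1 digest of each piece: 20 bytes (160 bits) apiece."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


def verify_piece(index, piece, expected_hashes):
    """Check a piece received from a peer against the published hash list."""
    return hashlib.sha1(piece).digest() == expected_hashes[index]


# A 1 MiB stand-in for the shared file.
payload = bytes(range(256)) * 4096
pieces = split_into_pieces(payload)
hashes = piece_hashes(pieces)

print(len(pieces), "pieces,", len(hashes[0]), "bytes per hash")
print("good piece accepted:", verify_piece(0, pieces[0], hashes))
print("tampered piece accepted:", verify_piece(0, b"x" + pieces[0][1:], hashes))
```

Because every piece is checked on its own, a downloader can accept pieces from complete strangers and simply throw away any piece whose hash does not match.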

Unfortunately, this property of BitTorrent means that only popular content remains downloadable via BitTorrent. For anything else, the original source of the file, if it still exists, has to be found and the file downloaded in the usual client-server manner.

But fortunately, there are many applications that fit this eccentric model. Television shows, blog posts, podcasts, CVS or bleeding-edge code, to name some off the top of my head. Content distribution via BitTorrent is not only fast and cheap but also far-reaching. Let me quote one article in Wired that I read recently:

Evidence that Burnham's prediction is coming true came a few weeks before the US presidential election in November, when Jon Stewart - host of Comedy Central's irreverent The Daily Show - made a now-famous appearance on CNN's Crossfire. Stewart attacked the hosts, Paul Begala and Tucker Carlson, calling them political puppets. "What you do is partisan hackery," he said, just before he called Carlson "a dick." Amusing enough, but what happened next was more remarkable. Delighted fans immediately ripped the segment and posted it online as a torrent. Word of Stewart's smackdown spread rapidly through the blogs, and within a day at least 4,000 servers were hosting the clip. One host reported having, at any given time, more than a hundred peers swapping and downloading the file. No one knows exactly how many people got the clip through BitTorrent, but this kind of traffic on the very first day suggests a number in the hundreds of thousands - and probably much higher. Another 2.3 million people streamed it from iFilm.com over the next few weeks. By contrast, CNN's audience for Crossfire was only 867,000. Three times as many people saw Stewart's appearance online as on CNN itself.
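A quick sanity check on the "three times" figure, using only the two numbers given in the quote. This is my own arithmetic rather than anything from the article: the streams alone come to roughly 2.65 times the CNN audience, so the BitTorrent downloads would have to add about 300,000 more viewers to reach a full three times, which fits the "hundreds of thousands" estimate.

```python
CNN_AUDIENCE = 867_000     # Crossfire audience, from the quote
IFILM_STREAMS = 2_300_000  # iFilm streams, from the quote

# How the streamed audience alone compares with the broadcast audience.
streams_ratio = IFILM_STREAMS / CNN_AUDIENCE

# BitTorrent downloads needed on top of the streams to reach exactly 3x.
bt_needed_for_3x = 3 * CNN_AUDIENCE - IFILM_STREAMS

print(f"Streams alone: {streams_ratio:.2f}x the CNN audience")
print(f"BitTorrent downloads needed for 3x: {bt_needed_for_3x:,}")
```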

This method of broadcasting content via BitTorrent, called PeerCasting, is being regarded as a nemesis by the distribution companies. Why? Imagine television with the advertisements ripped out, unnecessary scenes and credits cut, and nil cost of distribution via huge TV antennas and satellite dishes. That is the nightmare distribution companies have when they want to dream about their future. But what does that mean for the movie and television industry of the future? Self-produced shows, and the techie science shows that were superseded by fashion pageantry and the like, may increase in number and be sustained by the blogs and emails of those who patronize them. These will, of course, exist side by side with ordinary sequential television, for some time at least. Maybe some other technique will take the throne?

I, along with Sriram Krishnan and Balakrishnan, who are my classmates and close friends, am currently working on integrating the BitTorrent distribution model into blogging websites. Observing how blog servers were buckling under the current load of text blogs, we felt that the bandwidth required to serve hundreds of podcasts and video casts in the future would drive the blog servers out of money and out of the picture. We wanted to try fitting in a BitTorrent-like tracker server transparently, without modifying the existing blogging platform. Kinda like what Coral did. It seems not many have attempted this; others have instead been successful in replacing servers with BitTorrent trackers outright. The trouble we thought we would face if we attempted that was that nobody will replace their servers just because a few undergrads asked them to. BTW, the project is called "Smoke", and since a better and more in-depth description of it is available at Sriram Krishnan's website, I shall refrain from repeating it.


PS: During the project proposal phase of our Smoke project, I spent a full five minutes writing one sentence to my fellow mates on, as it turned out, a variety of topics. Check it out:

Don't bother printing, even if you have already done so, as I have changed the definition of "slashdotting" and changed the constant values and have printed it out, which is because of my lateness in answering to your late response to my earlier request to type out this project proposal as an adaptation of Sriram's description of our supposedly finalized final year project involving two of my favourite topics: BitTorrent and the web, which I suggest are my favourite topics because of their simplicity and effectiveness and, last but not in any way the least, their inherent beauty of which I am captivated the most, I should say, as will be expressed in my next blog post titled "BitTorrent, the Slayer", named because it (BitTorrent) hath slain the demon of a problem namely "redundant packets" that any server running the currently old Client-Server architecture can experience due to its inherent nature of not using the client's cache and the originally small files for which the client-server architecture was developed and which it served well and, hopefully, will serve well for quite some time into the future, that is, until our or rather a BitTorrent-like architecture, supersedes it - an event I sure wish I was around to spectate - an event I am sure will happen never minding what people might tell you just because there are softwares that only support client-server models around and because companies know how to compute Cost-Benefit which will show that the company will be better off adopting the BT distribution model and eliminate nasty bandwidth requirements and slashdotting effects than wastefully serving the same file simultaneously to multiple clients, while that bandwidth could have been used, say, to launch multiple new servers all running the BT-like distribution model, which excites me so much since we are too developing one such solution which might get popular if it is really revolutionary in such a way as this seemingly endless, and not to mention surprisingly techie, post that started out as a simple acknowledgement post and ended (not quite) as a "dotless", somewhat techy and hence not just a non-sensical rant usually just created for the sake of typing out such a monstrosity of a sentence which no one in their right minds would try to follow after the first few lines which indicate the shameful intent of the author.

So long (and thanks for all the files)!

Tuesday, January 04, 2005

Think you know about bootlegging?

Hell, even I thought bootlegging (pirating) was just buying a CD and sharing its rip. But this article from Wired blew my mind. Here are a few excerpts:

"In reality, the number of files on the Net ripped from store-bought CDs, DVDs, and videogames is statistically negligible. People don't share what they buy; they share what is already being shared - the countless descendants of a single "Adam and Eve" file"

"Whoever transfers the most files to the most sites in the least amount of time wins. There are elaborate rules, with prizes in the offing and reputations at stake. Topsites like Anathema are at the apex. Once a file is posted to a topsite, it starts a rapid descent through wider and wider levels of an invisible network, multiplying exponentially along the way. At each step, more and more pirates pitch in to keep the avalanche tumbling downward. Finally, thousands, perhaps millions, of copies - all the progeny of that original file - spill into the public peer-to-peer networks: Kazaa, LimeWire, Morpheus. Without this duplication and distribution structure providing content, the P2P networks would run dry."

"The sites use a "bounce" to hide their IP address, and members can log in only from trusted IP addresses already on file. Most transmissions between sites use heavy-duty encryption. Finally, they continually change the usernames and passwords required to log in. Estimates say this media darknet distributes more than half a million movies every day"

Dig deeper and read more about Frank, the hacker who distributed the Half-Life 2 beta a year before its release.