Sunday, January 30, 2005

DRM & ebooks

I have been following the DRM discussion on the Manning publisher's blog, and dropped in my own comment:
A note on Safari: it is not competition for your current method of distribution at all (even if they promote it as such). Their books are TOO annoying to read AS books. What Safari is occasionally useful for is looking things up, sort of an MSDN for open source technologies. Whether that justifies the price tag I will leave to my employer, but they are not ebooks.
Another note on piracy. People who stuff their HDs with pirated ebooks are not lost customers (which is not the case with movies) - the effort to actually read a book is much greater. So while I am not saying that piracy is a big problem, I am suggesting that it cannot be judged by the traffic of pirated material on the networks.
Lastly, piracy's role as advertising should not be ignored. Software companies have long known this fact and incorporated it into the way they prosecute violators. I occasionally use the networks as the "ultimate try and buy" mechanism - I downloaded a physics lecture ripped from a CD and that led to over a thousand $ of business for them.

Saturday, January 29, 2005

Dude, where is my car?

Last Tuesday this guy decided to park his freakin' car right in front of my driveway. I had no choice but to dig around his car to get out. But what should I do with all the extra snow...

Tuesday, January 25, 2005

more stupid religion stuff

This is another great candidate for the Darwin Awards. They should have learned from these guys. It's the 21st century, but stupidity marches on!

Monday, January 24, 2005

google this, BG!

Well, whatever you think of the Google browser rumors, I think it is good for those who do not want MS to take over the entire world to join forces. I only wish Apple got more on board and put itself behind Firefox, the way Google is doing. Not to say Safari ain't cool; I just think they should add their improvements via the Firefox codebase. This will keep MS busy while Apple is biting (pun intended) into the PC market share.
I venture to say that if there is an emotional weakness at MS, it is about market share. Where other companies would occasionally find it smart to yield to a competitor in a specific area and consolidate their strengths, I think MS would feel hurt, both in terms of public perception and because of their leveraging strategies, where everything is *also* a means for something else. So yeah, hitting them hard in the browser area should do some damage.

Sunday, January 23, 2005

wanna stone the devil?

Well, this should get the Darwin Award in the religious nonsense category.

Guido @amazon devcon

These are interesting summaries of what Python was/is

Saturday, January 22, 2005

what happens if...

You put a bunch of thugs, thieves, idiots and smurfs in the same building and give some of the most inept veto power? You got it, the UN! Read about some of their accomplishments here.

Thursday, January 20, 2005

calculus Flash-back

In case you forgot your calculus you can get it back, in a Flash (animation!)

apprentice redux

So, the Don is back. This time, there is a twist: "booksmarts" (hereafter BS) are competing against "streetsmarts" (SM). The BSs lost the first round by a small margin, probably statistically insignificant. Don blurted out something very predictable about how he has respect for education, but maybe streetsmarts are better. So let me enter a few comments. First, these guys are not all that nerdy. Graduating college qualifies you as a BS for this contest. People who dropped out of college to pursue an opportunity are on average smarter than those who go through college. So maybe you will tell me these BSs are super-nerdy. My impression, except for a couple of them, is that they aren't. And this is coming from a qualified nerd with a score of 81 :). So what do I think? I think there is a lot to be said for "streetsmarts", but I think there is a glass ceiling for most of them, and only partially because of the perception of others. This is because most businesses these days REQUIRE a pretty deep intellectual understanding of things that someone with just streetsmarts will not have. E.g. a lot of competitive differentiation in industries is technological. I also think that it was not always so. I think in the "old days" people with streetsmarts had a much better chance to ride on some success mixed with luck until they learned their business well. But not these days. Just look at the CEOs of Fortune 500 companies: not many SMs there.

Tuesday, January 18, 2005

I wish I could be this sexy...

Can I get this in a poster form?

Monday, January 17, 2005

your XML options in Python

Here is some really good coverage. Not to sound corny or anything on this Martin Luther King day, but I will make the observation that 2 of the best-known Python/XML experts, Uche Ogbuji and Paul Prescod, are in fact obviously black. I am sure the redneck idiots can have a theory for this one.

Thursday, January 13, 2005

google and common words

This is kinda interesting: I was curious to know how many PDFs Google indexes. One simple way to find out is to look for a common word, so I tried "the filetype:pdf". The result - nada! Of course this has an obvious explanation, according to Shannon: the information content of a symbol falls as its probability of occurrence rises (it is the negative logarithm of that probability). And since "the" is the most common word in the English language, it is THE most meaningless. After a little googling I saw them say so themselves in the Automatic Exclusion of Common Words section.
Being a curious monkey I decided not to take their word for it, and got some interesting results. Yes, they do not let you search on "the" in PDF, but allow it in HTML search despite their own disclaimer. A possible explanation is that they are changing their policy (so that people can find this) and have not updated their PDF index yet. But I dug deeper and realized that they have allowed "the" for quite a while, plenty of time to update their PDF index.
Then it came to me: the Google index treats each HTML page as a single document AND every PDF FILE as a single document. Since PDF files are on average significantly longer than an HTML page, the probability of "the" in the document is greatly increased, making the "the" in PDF that much more meaningless than "the" in HTML. So how do I know how many PDFs Google indexed? I just keep going down the list:

the: nada
of: nada
to: nada
and: nada
a: nope
in: ;(
s: -
it: -
you: bingo!
about 22M files.
I think this is very close to the total number of PDFs in their index, certainly within an order of magnitude.
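For the curious, here is a rough sketch of the argument in Python. The numbers are made up for illustration (the word frequency and the document lengths are my assumptions, not anything Google publishes), and it pretends that words are drawn independently, which real text does not do. It just shows that a common word carries few bits, and that the longer the document, the less it means that the word shows up in it at all.

```python
import math

def self_information(p):
    """Shannon self-information, in bits, of an event with probability p."""
    return -math.log2(p)

def prob_word_in_document(p, n_words):
    """Chance that a word with per-token probability p appears at least once
    in a document of n_words tokens, assuming independent draws."""
    return 1.0 - (1.0 - p) ** n_words

# Hypothetical values, for illustration only.
P_WORD = 0.005      # assumed per-token frequency of a fairly common word
HTML_WORDS = 300    # assumed length of a typical HTML page
PDF_WORDS = 5000    # assumed length of a typical PDF file

for label, n_words in (("HTML", HTML_WORDS), ("PDF", PDF_WORDS)):
    p_doc = prob_word_in_document(P_WORD, n_words)
    # Bits gained from learning that the document contains the word.
    bits = self_information(p_doc)
    print(f"{label}: P(word in document) = {p_doc:.6f}, information = {bits:.6f} bits")
```

Under these assumptions the word is nearly guaranteed to appear in the longer document, so a match on it tells you close to nothing, which fits with a search engine dropping it for PDFs first.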

Well, Bill Gates claims that strong IP is responsible for the success of American capitalism. I cannot deny that he is partially correct. But he has a lot at stake in saying what he is saying, so let's take a closer look. Many current objections to IP are about where IP is going, not necessarily where it was (which is the time period when many of the business successes Bill is talking about emerged). Another issue is the cost of IP.

Wednesday, January 12, 2005

still diverting

I am slowly munging through my diversion, about 190 pages into it. In case you want a quick summary of some of the main points, you can get it from Joel, who recommended the book in the first place.

HP is still doing cool research

As someone said, the future is here, it's just not evenly distributed (yet). I am talking about media search. Previous attempts to do media search relied either on metadata, which is very poor, or on SAP for TV feeds, which is much better than metadata, but still somewhat poor in relation to the content it represents. If you have just a few more resources than the rest, you can take a very resource-intensive technology such as speech recognition and build a search engine on top of it. And since it's on the internet... does it mean it IS evenly distributed?

Thursday, January 06, 2005

directory iteration

Ever heard of a cool way to do directory iteration in Windoze? You can do it in Python, too. Your pick.......
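For what it's worth, here is a minimal sketch of one way to do it in Python using the standard library's os.walk. This is not necessarily the approach from the link above, and the helper name iter_files is my own invention.

```python
import os

def iter_files(root):
    """Yield the full path of every file under root, parent directories first."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place makes os.walk visit subdirectories in a stable order.
        dirnames.sort()
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

Something like `for path in iter_files("."): print(path)` then lists everything under the current directory, and since it is a generator you can stop early without walking the whole tree.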