The toothpaste turned out ok. In fact, I think I like it better. The flavor is milder. The tube seemed strange at first, but I think it's a better design. It was made in Germany for a UK branch of an American company and the text on the package is in English and Greek. Maybe I'll buy an extra to bring back to the US to show people.
Microsoft and the US Justice Department have failed to make a deal. Does this mean all the lawyers are sent to the warp?
I'm seeing lots of reminders to set my clocks forward. But we did that here last weekend. Despite all my studying of time standards, I never knew that people did this on different days.
It went to the extra end, but Scotland beat the US in the World Curling Championship. This is the strangest sport I've ever seen. It's ice-shuffleboard, but with people madly sweeping the ice in front of the sliding stone to minutely affect its course. It was on national TV. So was the snooker competition.
Today I rejoined civilization. I got a library card. Not only am I rewarded with borrowing privileges, but also with the most beautiful library card I've ever seen. It's a masterpiece of visual design. Just looking at it makes me happy, even beyond the usual joy a library card evokes.
I remember my first library card. It was stiff orangish paper, mostly blank, smaller than standard cards, with an embossed metal tab used for imprinting. It was flimsy. It wore down. But it was the first card of any kind that was mine. It was eventually replaced by a cluttered plastic card with two bar code stickers added one at a time as the library changed computer systems, expiration dates and home addresses amended in pen, and a Video sticker added as a badge of adulthood. Although it expired six years ago, I neglected to remove it from my bag and have been carrying it around California and Scotland.
In college (and in grad school), our student id was our library card, but I took a law class one semester and was given a special card for use in the law library. It was exactly like my very first paper and metal card. It took me by surprise and made the otherwise pointless class worthwhile. I've had two cards in California, and I can't remember what they look like. This new card is definitely my favorite. (Must buy digital camera.)
The public library here, at least the one branch I've been to, isn't particularly impressive. The fiction section is tiny, but the organization of it is interesting. All fiction is shelved alphabetically by author, all genres together, but on the spine of each book is a label with an icon and a word or two depicting the genre. So they don't have to deal with the overhead of separate sections, but finding something specific is easy (if only it were there), and browsing by genre isn't hard since the icons are easy to distinguish and scan. I wonder who chose the taxonomy and who classifies each book. (It makes sense for the author or publisher to classify it, but only if there's a universal taxonomy.) I have the urge to label all the Gutenberg texts and train an automatic classifier. It'd be interesting to see how sophisticated the features needed to be for classification to work well. I suspect word frequency would be enough. It might also be fun to write stories that fooled classifiers.
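Here's a minimal sketch of what "word frequency would be enough" might look like, assuming a naive Bayes model with add-one smoothing. The class name, the two genre labels, and the one-sentence training texts are all made up for illustration; a real experiment would train on full labeled texts.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class WordFrequencyClassifier:
    """Hypothetical genre classifier that uses nothing but word counts."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # genre -> word -> count
        self.doc_counts = Counter()              # genre -> number of texts seen
        self.vocab = set()

    def train(self, genre, text):
        words = tokenize(text)
        self.word_counts[genre].update(words)
        self.doc_counts[genre] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_genre, best_score = None, float("-inf")
        for genre, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior for the genre, then add-one smoothed log likelihoods.
            score = math.log(self.doc_counts[genre] / total_docs)
            for word in words:
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_genre, best_score = genre, score
        return best_genre


# Toy training data, invented for this sketch.
clf = WordFrequencyClassifier()
clf.train("science fiction", "the robot piloted the starship past the alien planet")
clf.train("mystery", "the detective found the murder weapon near the body")
label = clf.classify("an alien robot landed on the planet")
print(label)
```

Writing a story that fools this would just mean stuffing it with the other genre's vocabulary, which is probably the first sign that richer features are needed.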
I did manage to find a couple books to read, but my primary target today was the music section. It costs 50p (US$0.80) to borrow a CD for three weeks, and I can borrow up to 12 at a time. They don't have a big selection, but they have plenty I don't, and I was getting bored with my own collection. I'll take great pleasure in borrowing music, but I'll enjoy the books more deeply. I think written language is the greatest human invention, above agriculture, even sliced agriculture.
Since I've been fixing up both the form and content of my site lately, I've gotten to wondering how it's used. The log analyzer I use is crap, so really I have no idea. I expect there are amazingly slick commercial ones that have most of the features I can think of, but I thought of some more esoteric things that may not be out there. I wrote up descriptions in my to-do page.
How many other people notice common HTTP status codes in the world around them? I know someone whose address is 404, and every single time I visit, I'm amused at the irony of that number emblazoned on the very door I'm looking for. On southbound highway 101 in Palo Alto, there's a big sign that accurately proclaims "Los Angeles: 404", and I smile every time I pass it. I'm sure I'm not the only one. Although I haven't actually heard anyone do it, I imagine there are people who use "404" in conversation.
"Have you seen my watch?"
"Sorry, dude. 404."
"404" could even refer to that U2 song.
But the other codes are less well known, and I'll bet they're not being properly exploited by the webgeek subculture. Now that it's occurred to me, when someone says something completely devoid of content, I'll have the urge to respond with a sneering "204" (No Content). That could be handy as election campaigning ramps up.
If a total loser asks you out on a date, slap them with a 406 or a 412! (Not Acceptable, Precondition Failed.) If you want to blow them off more politely, use the old favorite "409" (Conflict). 402 and 411 would be a bit rude in this context, especially if appropriate. :-) None of those are going to catch on, but I'm really going to have to start using 204.
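For anyone who wants to check their retorts before deploying them, here's a small sketch using the HTTPStatus enum from Python's standard library. The function name retort is just something made up for this example.

```python
from http import HTTPStatus


def retort(code):
    """Format a status code with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} ({status.phrase})"


# The codes mentioned above, in numeric order.
for code in (204, 402, 404, 406, 409, 411, 412):
    print(retort(code))
```

The phrases for 402 and 411 are "Payment Required" and "Length Required", which is why they'd be a bit rude.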
I was reading jwz's club log and I got down to where he mentions someone named "Da5id", which is just a clear ripoff of a character by the same name in Snow Crash (or is he the inspiration?). (I really want to ask him whether anyone ever misspells it as "da6d".) With that reference, plus the green on black and the context of constructing a hangout, it struck me that all jwz needs now is a big katana to complete the image tie-in. Of course, jwz was himself before Snow Crash was written, but it's still weird.
Yay! I finally got my robot photos back. Also on that roll were my pictures of a really cool device that uses a strobe light to reveal the completely predictable parts of an otherwise seemingly chaotic mess of splashing water. Two jets of water collide, and with the strobe flashing at just the right rate, the spray of droplets formed by the collision appears to stand still. Farther away from the collision point it dissolves into jitter as chaos takes control. I was surprised at how constant most of it was though. Credits to Noel for an assortment of body parts appearing in the photo.
Earlier tonight I felt like watching some TV. Of the four channels, three had sports: cricket, curling, and horse racing. You'd think I'd watch less TV here than I do in the States, but somehow that's not the case.
I love when a science fiction author dreams up a social custom that makes perfect sense for the setting and presents it without commentary, as a simple fact of daily routine. It fits in with the old saying "show, don't tell". I like that saying, and the adherence to that principle was one of the things I enjoyed most about The Matrix.
In my quest to avoid studying, I'm currently reading Blueheart, by Alison Sinclair. I'm enjoying it. Not only are the premises plausible and thought-provoking, but the presentation is good and the characters are more sophisticated and deeper than those in most SF stories. Weird coincidence: she was raised here in Edinburgh. Or is that why the library has her book?
Blueheart has many things in common with another good book I read recently, A Door Into Ocean, by Joan Slonczewski. The authors even have similar backgrounds, which is amusing. In one of the societies depicted in Slonczewski's sequel, Daughter of Elysium, personal privacy was a commodity. There were cameras and microphones everywhere, and if you wanted to have a private conversation or block incoming calls or just keep people from looking in on you, you had to pay by the minute for the privilege. It reminded me of David Brin's musings on surveillance, which are also worth reading. I should read the whole book, but I want to buy as little as possible until I'm back in the US. I'm tempted to start talking about his other books, but I probably shouldn't shoot my whole literary wad tonight.
Finding the URL for David Brin's essay wasn't as easy as it should have been. (Well, it was this time because it was in my email archives, but I remember it being a big pain when I wrote that email.) I really do need a personal web index.
And I think I'm going to integrate BibTeX into my site maintenance tools so I can use a central bibliography to store all the metadata of all publications I might want to refer to, rather than just URLs as I do now. Am I a complete geek or what? In doing so I'll reduce the entropy of my web site. Funny that I mention it here.
I'm tired of waiting a month for pictures, so I'm going to get a digital camera. They're not as good as film yet, but I think they're finally good enough. I'm sure I'll have more to say about it after I actually have the camera. I'll probably want to say "Ooh, I can't wait until I get that camera" every day for the next month, but I'll try to restrain myself. Instead of writing something here (no, really, this doesn't count), I've written a little bit about how much film cameras cost, with some unavoidable tangents into convenience advantages.
I'm not going to make a habit of posting news links, but most people won't see this (or anything like it) in their usual news sources.
Via BBC News:
Clan chief John MacLeod of MacLeod wants to sell the mountains to help mend the leaking roof of the family home Dunvegan Castle.
Mr MacLeod insists a 16th century charter gives him the right to claim the mountains.
He wants £10M (US$16M). Must be a big leak.
Last night I looked at the feature lists of a dozen or so web log analyzers. I was surprised at how little they did. The commercial ones have slick presentations, but are still just presenting fairly raw statistics. Only one looked interesting, and it's a university project. There must be companies selling software for this, but there are too many for me to check.
For quite a while now, I've been working out details for my music analysis project, but haven't been able to actually start doing much for it. In three weeks I'll finally get to start, and now I'm getting distracted by notions of data mining. Hmph. Too many fun things to do.
There needs to be a better way of finding good products and services in a cluttered market. Industry journals are one way, but they probably don't do big surveys very often, and they have reasons not to put their content online. Maybe Epinions will catch on. Or has it? I'm so out of touch. Their trust mechanism is too primitive though. I want to be able to trust someone's opinions in one area without trusting them about everything.
The folks at dmoz probably thought of it ages ago, but something exciting just occurred to me. If a service uses the Open Directory taxonomy, they're essentially contributing to the directory itself, since anyone using the directory can trivially link to the chunk of their service relevant to a specific category. A content provider could index their content that way, even if it's not their primary method of organization. Then the Open Directory (or perhaps a higher level aggregator) could easily link to "Related NY Times Articles", and "Related eGroups Discussions" for every category. (In fact, eGroups does use the dmoz taxonomy.)
Note to readers:
I changed the title of this page from "Seth's Entropic Decay" to the simpler "Entropy". When I put the whimsical first title up there, it hadn't occurred to me that people would actually refer to this page by that name (or that people would refer to it at all). No need to change your pages though. Wear the old reference proudly as a badge of primacy.
Today I went to a local science fair, where our robot and four others from my class were being demonstrated. There was a gaggle, no, at least two gaggles, of kids watching and poking the robots, some to make them react and some just to count coup. They seemed to be having a good time (the kids, not the robots). Our robot was working more reliably than it did during the competition. It'd be fun to teach a robot to play with children, using reinforcement learning and a giggle sensor or some way to detect continued interaction.
There were lots of things at the science fair, but working autonomous robots are apparently good news fodder. They got a big mention in a short newspaper article, and I'm told they appeared in the background during a local TV news bit about the fair.
I had a flashback today to the countless episodes of pencil sharpening that shaped my childhood. The ritual of walking up to the sharpener at the front of the classroom, holding the pencil and grinding away at the crank, appraising the new point, smelling the shavings as I dump them into the trash, fitting the cover back onto the sharpener with a practiced twist, and walking back to my seat as though nothing special had happened. Plus there were those special moments of annoyance at someone sharpening their entire pencil collection, balanced out by the rare indulgence of sharpening a fully grown pencil down to a nub all at once.
I was talking about pencils to someone recently, though we didn't go into such detail. I mentioned that I'd gotten a new mechanical pencil. She said she stuck with the old fashioned kind partly because she could get it just the right sharpness and partly because she enjoyed sharpening them. I didn't really think about it until it snuck up on me today.
Last night I changed something in this site's style sheet to make text margins work differently. It worked perfectly on the first page, but a second page would crash Netscape. Sigh. I want Mozilla to ride up on a big white horse and fix everything.
I first read the headline "Microsoft Falls on Analyst's Report" as a modern version of "Microsoft Falls on Own Sword", but it was referring to their diminishing stock price. Later I heard the phrase "children in the year 2000", and it initially struck me as the start of a prediction, not an observation. I wonder how long it will be before I think of the 21st century in the present tense.
Today's new word is "invigilator". From WordNet:
n : (British) someone who watches examination candidates to prevent cheating
Look before you leap. Someone has a blog called "Entropy". I'm tempted to ignore it and let the market decide, but the resulting confusion would be annoying. I'm going to continue to call this page "Entropy", to distinguish it from the rest of my pages, but the rest of you should probably call it "aigeek", to distinguish it from the rest of the universe.
There's a harp festival in town this week. I guess lots of hobbies have festivals, just like lots of hobbies have magazines. Is there a harp magazine? Is it called "Harpers Bizarre"? <moan>
From a book on reinforcement learning:
The beginning of time occurs only once, and thus we should not focus on it too much.
Will fooling data miners become a common children's prank? Imagine you and your buddy creating disposable fake contact info, expressing interest in some strange and seemingly unrelated things, and laughing your heads off when the goofy recommendations pour in. "If you like denture cream and kayaks, you might also like estate planning services!" It's already standard practice to put friends and enemies onto strange mailing lists, but this would add some poke-the-robot excitement.
I dislike being forced to pay for things I don't want, but I don't mind paying for things I do want, especially when it's convenient. I'm enjoying the Panda Cam, so I took advantage of their online donation form to help ensure a future with live pandas. I wonder how much money they'd raise if there were a panda channel on TV and people had a Donate button on their remote controls. I wish NPR and public TV had systems like that.
Instead of aigeek.com, I could have gotten geek.off.ai. Uh, no thanks. I could have gotten geek.ai too, but only by moving to Anguilla or forming a company there, either of which probably costs a bit more than the usual domain fees.
I recently visited the web page of a researcher I'd been hearing about for a while and was surprised to see how young he looked. I think there are two reasons I assumed he'd be old. One reason is that he's done research that I've heard of. There are people younger than me in that category, but most are considerably older.
The other reason is that his name is "Sebastian", and no one is named that these days. But that's not true; I've met a few young Sebastians here in the UK. It's just that the name is out of fashion in the US, so I've always associated it with old people, just like I do "Edna" and "Winston", only the association is stronger with those examples, because there are living old people with those names, whereas I don't think I know anyone in the US named "Sebastian".
I've always been aware of the associations I had between names and age, but it didn't occur to me until now that the trends are always local, and that the linkage fails when you meet people from other places. But that's what travel is for.
The word of the day is "invigilator". Two down, six to go. I was even well prepared, despite mostly finishing my web indexer over the last few days. During revisions today (that's Brit-speak for studying-for-exams), I had a moment of enlightenment, before which I didn't understand support vector machines, and after which I did. That was a nice feeling. I think I need another six-pack of enlightenment before I could use them effectively, but it's good to be on my way. (No, they weren't on the exam.)
I finished Idoru tonight. It took me three tries to start it over the past week; I was too distracted and couldn't mesh with the rhythm. But once it caught, it was a smooth and kaleidoscopic ride right to the end. It was a bit unnerving how many elements were similar to other things in the genre, and to things in the subculture I call home.
I also took my Learning From Data 2 exam today, for which I answered questions about reinforcement learning, extracting symbolic rules from neural networks, radial basis function networks, and time series analysis. Lots of fun.
For those who enjoy learning things: NEC's Research Index is an excellent resource for browsing published computer science papers.
Scottish weirdness of the day: "pickle" does not mean "pickled cucumber" here. It means "brown lumpy stuff that might be related to pickle relish". "Gherkin" means pickle, though my dictionaries define it as "those small prickly cucumbers that are often pickled, whether or not they have been". I discovered this during a conversation about (and over) bacon and pickle flavored potato chips.
In a brief note about writing information extraction bots, Michal Wallace wrote, "most sites do not yet use XML, so for most bot tasks, we're stuck with regular expressions."
Luckily, that's not true. Bots can be made to exploit the existing structure, even though it's meant for visual organization. My txt2html program was a crude attempt at that, but more sophisticated work has been done on it too. They call such bots "web wrappers", since they wrap around relatively unstructured data and provide it to something else in a structured way.
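As a rough illustration of the idea (not of txt2html or of any particular wrapper system), here's a sketch that leans on the visual structure of an HTML table instead of regular expressions. It uses Python's standard html.parser module. The class name TableWrapper and the sample page are invented for the example.

```python
from html.parser import HTMLParser


class TableWrapper(HTMLParser):
    """Hypothetical wrapper: turn table rows into lists of cell text."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# A made-up page whose markup is meant for display, not for machines.
page = """
<table>
  <tr><th>Title</th><th>Author</th></tr>
  <tr><td><b>Snow Crash</b></td><td>Neal Stephenson</td></tr>
  <tr><td>Idoru</td><td>William Gibson</td></tr>
</table>
"""

wrapper = TableWrapper()
wrapper.feed(page)
# Treat the first row as field names and the rest as records.
records = [dict(zip(wrapper.rows[0], row)) for row in wrapper.rows[1:]]
print(records)
```

The bold tag inside the first cell is purely presentational, and the parser simply passes over it, which is the sort of thing that gets awkward fast with regular expressions.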
And even when you have to do it by hand, there are tools to help:
There are seven links to me in the Open Directory, and six of them point to older locations of my site. They all still work via redirects, but I wish I could somehow take responsibility for pointers to me and fix them. One category contains five entries, two of which point to me, each of which points to a different domain. They say the category needs an editor. I agree. I applied. Just to fix a link.
Later I discovered that anyone can submit the same edits as a non-editor. The changes just need to be approved before they appear.
One nice benefit of Amazon's affiliate program is that when I save a URL, it contains not only a pointer to the book, but also a reminder of who recommended it to me.
I'll warn you now, this is as geeky as it gets. (And in fact, I just did.)
I was thinking about speech acts, and wondering whether there's an equivalent in a programming language. A speech act is speech that performs an action. A lovely example appears in the preceding paragraph. I said I'd warn you, and by saying so I did. In one sense, which probably would be derided by more studious linguists, any speech can be seen as a speech act, since it acts to convey information and assert belief. Those acts could be considered just as much a part of a social contract as anything else. But pushing the fuzzy line that far is overly silly and is probably proscribed by more rigorous definitions of speech acts.
One could argue that programming languages consist entirely of speech acts (especially declarative languages like Prolog). I say x is 5, and so it is. Or was it already? Am I just describing the world, explaining existing truths and relationships, or am I creating it, describing things that didn't exist until I said them? To what extent does it depend on the language and the context? And whose speech is it? Am I only spouting imperatives? Is the computer creating and declaring at my request? Is a nonvolitional entity capable of speech acts of its own or only of relaying someone else's? (Don't answer without first defining volition.)
In some notes on UI design, Joel Spolsky opens up with an amusing example of accumulative tweaks, and makes this point:
Most prohibitive signs are there because the proprietors of an establishment were sick and tired of people doing X, so they made a sign asking them to please not.
That's often true, though it's just a special case. The more general explanation is that the proprietors expect certain things, and only make signs about those things. But the reason I bring it up is that this is just one more great thing about unicycling. No one ever explicitly prohibits unicycles. In Edinburgh though, the signs say "cycles", so I lose.
He also makes some arguments for why configurability isn't always good. I agree with him on many things, but he made a couple sloppy points I couldn't let slide.
I love online books. I still prefer paper when I'm reading linearly, as I generally do with fiction or tutorials (the first time through), but when I'm looking something up, I often find it easier to do it online. I also tend to have a book with me more often if it takes up space on my hard drive instead of in my bag.
Some free online books for fun and mischief:
I have lots to say, but no time to write. Seven exams down, one to go, lots of cramming to do before Tuesday. For now, I leave you with this gem I saw on Usenet (probably on rec.humor.funny) several years ago.
From: email@example.com (Mary Ries)
Subject: A study of 3 studies
Study leader to the group:
"We will be using 3 different methods for studying. Some people like to learn by indepth study of the meaning of the language contained in the manuscript. Others like to use commentaries written by experts who have studied the manuscript. And finally, others like to learn by practical discussion and comparison, often volunteering personal anecdotes from thier own experiences."
A student excitedly interrupts and shouts:
"Oh yes! I do that!!"
Long distance telephone calls cost 25 to 50 times as much in 1945 as they did in 1995.
Today, a self-dialed long-distance call takes only three to five seconds to dial, compared to the 1942 operator's average call set-up time of 2.3 minutes.
Long-distance calling in 1945 and the immediate postwar years required talking to an operator, who then contacted another operator to complete the call. The Bell System employed 171,439 operators in 1945, who were setting up about 600,000 long-distance calls on the average business day.
If the same ratio of operators to calling volume was applied to AT&T's 1995 average 24-hour business day, today's long-distance network would require more than 51 million operators. Instead, AT&T needs only 10,000 operators.
At the end of 1945, New York state had the most telephones, with 3.5 million, followed by California with 2.3 million. [..] In 1993, the latest year for statistics, California reported 20 million access lines--almost twice as many as New York with 11.2 million lines.
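The operator figures above can be turned around to see what call volume they imply. This is only a back-of-the-envelope sketch: it uses the numbers quoted above and treats "more than 51 million" as exactly 51 million, so the result is a lower bound. The variable names are mine.

```python
# Figures quoted above.
operators_1945 = 171_439
calls_per_day_1945 = 600_000
operators_needed_1995 = 51_000_000  # "more than 51 million", so a floor

# Operators required per daily call at the 1945 ratio.
operators_per_call = operators_1945 / calls_per_day_1945

# Daily call volume in 1995 implied by that ratio.
implied_calls_per_day_1995 = operators_needed_1995 / operators_per_call

# Call set-up speedup: 2.3 minutes by operator versus 3 to 5 seconds dialed.
setup_seconds_1942 = 2.3 * 60
speedup_low = setup_seconds_1942 / 5
speedup_high = setup_seconds_1942 / 3

print(f"operators per daily call in 1945: {operators_per_call:.3f}")
print(f"implied 1995 calls per day: {implied_calls_per_day_1995:,.0f}")
print(f"set-up speedup: {speedup_low:.1f}x to {speedup_high:.1f}x")
```

That works out to at least about 178 million long-distance calls per business day, and a set-up time roughly 28 to 46 times faster.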
Last night I watched some of Channel 4's 100 best TV commercials. My favorite moments: