The Semantic Web – Is This Progress?

First off, a disclaimer: This blog entry is not really about the uber-Semantic Web, just a small portion thereof.

For one, I don’t fully understand it – the whole concept.

For another, I don’t care to discuss it today (even if I did fully understand it).

Understand?

Actually, I read a very thought-provoking blog entry by Gina about ridding your URLs of IDs and extensions etc.

In other words, making them more like English (or whatever language), and less like geekSpeak (spoken at many places, especially /.). A step toward the Semantic Web.

Examples (paraphrasing Gina’s examples):

  • Bad: http://www.somesite.com/blog/index.php?entryID=123
  • Bad: http://www.somesite.com/blog/123/
  • Good: http://www.somesite.com/blog/why_I_blog/

Long story short, she didn’t like to either give away her file structure or bother people with IDs (123 is an identity column value), and she wanted to give a meaningful URL – “why_I_blog” is more meaningful than “123” (agreed!).

As someone who has built his own blogging tool (implemented locally, only), this was a great article.

I didn’t put a whole lot of thought into the way items were stored (or, more correctly, displayed) when I built the tool (building it was too much fun; honest…), but I have given it a whole lot of thought since then.

As in, a lot (plus tax…)

So, this article hit a nerve.

But I still don’t agree.

Here are some of Gina’s arguments, and my responses:

  • “…[index.php?id=123 type URLs are] not good because it includes a file extension (.php) which is sure to change when I port my site to Python CGI or Java Server Pages or flat files…” This is a valid concern, but I don’t put too much stock in it. If you can’t do a code sweep to find all links (twixt anchor open/close tags) and replace, say, “.php” with “.jsp”, well, Houston, you’ve got a problem. (10/07 update: One issue that was NOT mentioned is that changing extensions will break bookmarks: index.php is now index.jsp, whereas if both pages were plain “index” [sans extensions], bookmarks would still work. An excellent argument for dropping extensions, but Gina didn’t explicitly mention it. And I was too stoopid to realize this…)
  • Expanding on the above: The above issue sort of assumes that the same code will somehow be directly ported from your old ASP site to your new PHP site. Nope. At least, that’s been my experience. Usually, the issue is with the database, pulling data from there. That is the part that should be rock-solid.
  • “File extensions expose technical details of the site’s inner workings” As in, hmm, .php3, when are they going to upgrade? and so on. I have several responses:

    • So what? So people can see that you run on ASP vs. PHP. Big whoop. The dorks (us) will always be able to figure this out, so why hide it?
    • Non-dorks don’t even pay attention to URL structure (past the www.blah.com). File extension? What’s that? Oh, it has to be an HTML page to display! (PHP? No, that’s not HTML…). This is mixing up content types and file extensions. Two different things.
    • The method Gina recommends to hide the extension (mod_rewrite) can also be used to – hey! – rewrite the extension!
    • To a much lesser extent, extensions are a help to the middle users – not the casual users, not the hard-core dorks, but people who work on the Web and find that the “.” extension helps them for whatever reason. Why not?

  • While I agree that the URL that ends with “/why_I_blog” is more user-readable (NOT more user-friendly; see below), I guess I have to say, “So what?”:
    • Who reads URLs? Dorks. Why do English URLs help me? OK, maybe it’s easier to remember “{base URL}/my_thoughts” than “{base URL}/blog/entry/index.php3?entry=123”, but how does that really help me? How many sites do you know down past the base URL? So how exactly is this helpful?
    • Let’s say you do remember the “{baseURL}/advertise” link. What if there is a link for, basically, how to advertise on this site, and one for a how-to article on how to advertise (pretend it’s a PR site)? Yes, the links must be unique. Yes, one has to be called – at some point – something dorky to accommodate all the permutations of “advertise” and so on.
    • OK, it can be helpful semantically if there is English instead of “code” (much like DNS works – www.whatever.com is easier to remember than 12.34.56.78).
    • Repeat the preceding bullet point and substitute any other language for “English.” Whoops. Unicode to the rescue? Integers (id=123) usually translate better than character-based language (better: NOT perfect or directly, by any means. Also depends on your math…)

  • Semantically-correct URLs might look nice (especially now, when they are uncommon), but they don’t help with either 1) Favorites/Bookmarks (which use the page title, not URL info), or 2) links: how many people actually display the full URL (as in “http://www.somesite.com/why_I_blog”) as opposed to putting the full URL in an HREF and then some phrase (“Why I Blog”) as the link text? Yes, some do, and a readable URL would help there, but not when the URL is as long as some of Gina’s (example: “http://www.scribbling.net/how_scribblingnet_freed_itself_from_file_extensions_and_internal_ids” – honest). To be fair, she said she is thinking of trimming all post root URLs to 15 characters or fewer, but that just introduces the following problems:
    • It limits the number of permutations of links you can have
    • What are the rules for cutting down to 15 characters? The left 15 characters (including punctuation/white space)? Up through the last word that falls somewhere around character 15? What? Or will the user be forced to have unique titles?
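To make those trimming questions concrete, here is a rough sketch of one possible set of rules. These are my assumptions, not what Gina actually did: lowercase everything, join words with underscores, cut at the last whole word that fits in 15 characters, and tack on a number when two titles collapse to the same slug. The names make_slug and unique_slug are made up for illustration.

```python
import re

def make_slug(title, max_len=15):
    """Lowercase the title, join words with underscores, cut at a word boundary."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    slug = ""
    for word in words:
        candidate = word if not slug else slug + "_" + word
        if len(candidate) > max_len:
            break  # adding this word would pass the limit, so stop here
        slug = candidate
    # If even the first word is too long, fall back to a hard cut.
    return slug or (words[0][:max_len] if words else "")

def unique_slug(title, taken, max_len=15):
    """Append _2, _3, ... when the trimmed slug is already in use."""
    base = make_slug(title, max_len)
    slug, n = base, 2
    while slug in taken:
        suffix = "_" + str(n)
        slug = base[: max_len - len(suffix)] + suffix
        n += 1
    taken.add(slug)
    return slug
```

Note that the second "Why I Blog" post ends up as "why_i_blog_2", which is exactly the "something dorky" problem: at some point a number sneaks back into the URL.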

There are other issues, but – on the whole – I don’t see the reason for a semantic URL yet. I like them; I definitely hate long URLs with all sorts of params passed, but I don’t see that much of an issue with “..index.php?id=123”.

But maybe (gasp!) I’m wrong. I’ll have to think about it.

But I’m glad I ran across Gina’s site and saw what she had to say and how she did what she wanted to do.

Makes ya think…

Unix Tricks

As with most other computer software – OSes, applications, programming languages – I am self-taught in Unix.

This is good and bad (overall, not just Unix-specific):

  • Good: I have lots of skills and experience in figuring out the unknown. I’m not locked into one mode of thinking (“Java is for servlets” or “Java is for JSPs”). I’m not lost without class notes or my course books. Brand-new stuff doesn’t throw me – it’s all been new.
  • Bad: I’m sure I’ve learned some bad habits – no doubt. And being taught (via book, course, or with another user of the software) helps you discover things you’d have never discovered on your own – or it would have taken some time.

For example: Just today, I learned about units in Unix.

How the hell did I not know about this before?

And how cool are units?

Unix continues to surprise me. Well, not surprise, really – more like it continues to amaze me, as I’ve almost gotten used to seeing something new (to me) that’s packed into the OS. Seems to be no end to it. Which is great!

And that’s one of the reasons that I consider myself almost a Unix newbie: I write shell scripts, love the command line, always have at least two terminal windows open…but there’s so much out there.

Take the “cal” command – hell, for a Windows system, this would be a separate app: In Unix, it’s a standard utility that ships with the OS. Yeah yeah yeah, you can use Outlook for this to a degree…but can you find out what day, for example, the Fourth of July fell on in 1776?

Unix: []# cal 07 1776 (a Thursday, I see…)
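For anyone without a shell handy, the same lookup can be sketched in Python's standard library. This is just a cross-check, not a replacement for cal; the helper name weekday_name is made up. Python uses the Gregorian calendar for all dates, while cal typically switches calendars in September 1752, so the two agree for 1776.

```python
import calendar
from datetime import date

def weekday_name(year, month, day):
    """Return the English weekday name for a Gregorian date."""
    return calendar.day_name[date(year, month, day).weekday()]

print(weekday_name(1776, 7, 4))   # Thursday
print(calendar.month(1776, 7))    # a text calendar for July 1776, much like cal prints
```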

Again, I’m continually amazed.

And I like…

Update (a couple of minutes later):

I was just going through the man page for cal on the site I linked above.

And I didn’t know about the “-3” argument. (Displays the month specified, as well as the previous and next month)

Holy crap I like this stuff….

The Sun Continues to Set

Don’t get me wrong – I like Sun. They make good equipment, the Rolls-Royce of Unix machines (hey, hot swappable CPUs! Whoo-hoo!).

You often don’t need a Rolls to drive to the corner market.

As predicted in my beginning-of-the-year thoughts, Sun is in trouble.

And each quarter it gets worse.

And Linux keeps getting better, and computer boxes are becoming more and more of a commodity.

While Sun does appear to be circling around the black hole of debt, it’s still sad. But let’s face facts: The way software – and hardware – is used today has changed.

And yesterday’s SPARCstation has the processing power of today’s coffee maker.

Sun should have – years ago – woken up and smelled the coffee….

The Scripting News

No, this isn’t a review of a new E. Annie Proulx book, or an homage to Dave Winer’s blog.

It’s just some notes on scripting.

In my home office, I run two main computers – one a Win2000 box, with SQL Server on it; the other a Linux (RH 7.3) box.

I also own two domains that are currently active (such as the one that this blog resides on); like my home setup, one is Win2000 (but with Access databases; I don’t need SQL right now [if the same price, I’d be on it in a minute, however] ) and the other a Unix-type box (Concentric has a proprietary Unix; seems to be BSD-based).

At home, I run different things on different boxes, and then I need to do the following tasks:

  • Move stuff from one home box to some remote server (one of my domains), or
  • Move stuff from one home box to the other home box, sometimes to dupe the environment to test for quirks (example: I run PHP and mySQL on both Linux [mainly] and Windows [quirk testing]), or just as backup, so that if one box dies, the important stuff (databases, personal files etc.) is backed up for at least 10 days.

In other words, I have to shuffle a lot of data around at various times. In our networked times, this is normal.

However, I’ll be the first to admit that this is not what I do best: My skills(?) are coding dynamic Web sites that bind well to the backend (databases) and have a user-friendly front-end.

Maybe someday I’ll be really good at it. Today, I have done more of this type of scripting than the average Web developer, but I need to do more to be really good. Just a reality. (Of course, some people download and modify a Perl script and think they are scripting wizards, but that’s another story…)

What I’ve discovered:

  • Windows scripting sucks – Yeah, this sounds like a typical MS-is-Satan statement, but it’s just a reality. Now, I’m more adept at Linux admin than Windows admin (I guess…), but the Windows scripting components are so fragmented and primitive it hurts. Is there a way to script basic operations (i.e. not through apps, such as SQL Server) like file copies except via batch files (.bat)? Seems limiting, and – frankly – it is.
  • Linux built-ins rock – By built-ins, I mean the basic stuff that is usually there (tar) or those easily installed and free (zip, gzip etc). For Windows, I had to buy a license to the WinZip command line so I could use it in batch files. I didn’t really mind this – was inexpensive (~$30-35) and I was glad to give back to a company that I’ve used free tools from for….forever – but with Linux, that’s not a cost.
  • CRON vs. MS Scheduler – PUH-LEEZ! While Crontab is a command-line thing that a lot of people don’t like (a negative), and the MS Scheduler has an expire date option (a nice), Crontab is just soo much easier, faster and cleaner (to me). One drawback of the cron system is you have to learn it, and maybe the syntax for repeated tasks (every five minutes instead of at 12:05) is not a pull-down box, but you get there. Once you do, you don’t want to go back. I don’t.
  • Both batch and shell scripts are hard – Unlike HTML, for example, scripting is a more exact science, and is not as flexible. Both seem at least loosely based on C, and C is not forgiving and not terribly intuitive (until you understand it, when it becomes great). Again, I have not done that much scripting, and most that I have done is on the Linux side, so I’m more ignorant of the Windoze scripting stuff, but shell scripts (I use the Bash shell) seem far more powerful than the DOS shell. Again, could be my ignorance.
  • Bottom Line – I do as much scripting as possible on the Linux side. I have scheduled tasks on the Windows side (zipping directories for eventual move to Linux box; SQL Server daily backups…), but whenever the task involves both boxes, I put the load on Linux…because the load ends up lighter. For example: Each day, I export a SQL Server database to an Access database and then upload the Access database to my Windows-based domain. But I currently do the first part – SQL Server => Access – via Windows Scheduler. However, I then pull that Access database to my Linux box (for backup) and then FTP it to my remote Windows box (because it’s easier this way). I’m sure I’m missing something, but…
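As a rough illustration of the kind of job I mean (zip a directory, keep about 10 days of copies), here is a minimal sketch. It is written in Python rather than Bash only so the same file could run on either box; the function name, the backup- file prefix and the paths in the comments are all made up, and this is not the script I actually run.

```python
import shutil
import time
from pathlib import Path

KEEP_DAYS = 10  # keep roughly ten days of archives

def backup(source_dir, backup_dir, keep_days=KEEP_DAYS):
    """Zip source_dir into backup_dir, then delete archives older than keep_days."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # make_archive adds the ".zip" extension itself and returns the full file name.
    archive = shutil.make_archive(
        str(backup_dir / ("backup-" + stamp)), "zip", root_dir=str(source_dir)
    )
    cutoff = time.time() - keep_days * 86400
    for old in backup_dir.glob("backup-*.zip"):
        if old.stat().st_mtime < cutoff:
            old.unlink()
    return Path(archive)

# Hypothetical usage: backup("/home/me/data", "/home/me/backups")
#
# Hypothetical crontab lines (script path is an assumption):
#   5 0 * * *    python3 /home/me/bin/backup.py    <- once a day, at 00:05
#   */5 * * * *  python3 /home/me/bin/backup.py    <- every five minutes
```

The two crontab lines show the difference mentioned above: a fixed time (minute 5 of hour 0) versus a repeat interval (*/5 in the minute field, which the cron that ships with most Linux distros understands).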

Random Musings

Some random thoughts, that may lead to more in the future.

  • Language Firestorm – Not surprisingly, the comments by Philip Greenspun comparing Java to an SUV (not in a good way, for the most part) have ignited fierce debate on blogboards (such as /.) and personal blogs. Basically, he sees Java and JSP use as pretty much overkill for most Web projects – much like a Hummer H2 is a little overboard for running to the mall and back. I tend to agree with him, but – of course – some people elected to take it a little too personally and seriously. “Slamming Java! You bastard! Java is the best of everything [blah blah…]”. Whatever. Massive Internet or intranet project? Java is a good language – the whole J2EE thingee, with calls to EJBs, not a bunch of embedded scripting code. But, for the most part (and most projects), overkill. Perl/ASP/PHP/ColdFusion are much faster to deploy and maintain for most programmers.

  • Good language != good coding – As Megnut pointed out in her observations of the preceding point, a message on /. sort of sums up a lot of what people, for whatever reason, were not saying: Bad programmers write bad programs regardless of the language. The right tool for the right job, used properly folks…this ain’t religion.
  • VeriSign Directing Typo Traffic to Its Own Site – This doesn’t bother me as much as it should. I guess I’m not that bothered because I’m used to seeing the same behavior on the browser side with IE. That said, it’s problematic because it creates technical issues (spam filters et al). However, the worst part is that if ICANN lets VeriSign get away with this, it pretty much means ICANN is worthless. And that’s not a good thing.
  • The Object-Embedding Patent (Eolas vs. Microsoft) – This one is scary for the precedent. There are too many things wrong with the current suit (mainly, prior art), but it’s the precedent, or the potential for something like this to really happen: What if someone does come out with a valid, pre-existing patent for something like Apache or BIND? Then what happens? It’s scary. How scary? Even on the normally “Bill Gates is Satan” sites, there was actually a grudging bit of sympathy for MS. (Why? Because it affects us, not just MS…but still a rare [one-time-only] olive leaf of sorts to MS). (09/25 update: A News.com article about the preceding pretty much supports what I’ve said.)
  • Bloggers Fired – Didn’t take too much to see this one coming, but as Dan Gillmor points out, journalists being fired for their blogs’ content is not going to end with the two cases he points out in his story. And this will be true in other industries, as well (made-up example: Engineer at Lockheed talking – in broad terms – about a new project he is working on. WHAM! You’re leaking proprietary secrets!). One real danger I see here is the whole politically correct mode the country is for some asinine reason embedded in: People are going to be fired for posting on their own blogs when their personal views (on their own servers….etc…) conflict with the image the company they work for is trying to cultivate. Especially dangerous will be any remarks regarding co-workers (or others) that may be construed as sexist/racist, even if benign and fact-based – for example, if I said, in some story about a picnic I went to, that the Franklin family ate 90% of the watermelon consumed. Oops! The Franklins are an African-American family. I’m fired! (even if everyone else there were African-Americans, because I’m not…) And so on. But that’s the life of a writer, which bloggers are, like it or not. Not always skilled writers, mind you, but we all are.

And the Sun Sets

There was an interview in the San Francisco Chronicle with Scott McNealy (CEO/co-founder of Sun) that was pretty interesting.

I don’t know, maybe he was just tired or the editors were peppering him right and left with uncomfortable questions, but McNealy came across as … as a “dick.” At the very least, cranky or testy.

And McNealy is chairman, president and chief executive officer of Sun, so how can you account for this statement by him (in response to a question about what to tell kids who want to go into tech – good/bad?): “I’m not a great visionary.”

HUH? So what exactly is it that you do?

And, in case you think I’m taking one part of a longish interview too seriously, McNealy was asked to comment on a Michael Dell comment that the future of technology does not belong to proprietary systems (such as Sun’s Solaris vs. Linux on software, for example).

McNealy stressed the importance of R&D (OK, but what does R&D really have to do with proprietary systems?? IBM is doing a lot of R&D on Linux..) and then uttered: “How many extra people would be working in the auto industry if Henry Ford hadn’t figured out, ‘We’ll give you any color you want, as long as it’s black.’”

HUH (again)?? What does that mean?

Well, McNealy meant that there were benefits to keeping all technologies in the same “flavor” (so there are no interoperability issues, for example), but this introduces (at least) two problems for McNealy:

  • Isn’t this an argument that supports the Microsoft juggernaut? (McNealy’s favorite bashing post) MS is all MS, all parts (OS, Office, SQL Server, CE etc….) – you buy into the platform, the rest will seamlessly integrate (the concept, OK?)
  • Isn’t this the Ford quotation that everyone holds up as an example of how Ford blew it: Yes, he changed (interchangeable parts/assembly line) the making of autos – but refused to acknowledge that others could do this as well (why not other colors?) and that the customer wasn’t always right, Henry Ford was, dammit! Oldsmobile came in with colors (red, I think, to start) and hurt Ford’s sales.

Ouch.

On the other hand, McNealy has always been this way – and, in many ways, it’s refreshing. While he always harps on the same things (MS, antitrust, Dell….), he doesn’t do as much weasel-speak as others in his type of position.

But I still see the sun setting on Sun sometime shortly…buy out, merger…Probably not bankruptcy, they have dough in the bank.

Still.

More Warren

I mentioned in an earlier post that Warren Zevon had finally passed away – I think it was a week ago.

I just got his new CD, which was written in the space between diagnosis (terminal) and death. It’s garnered generally very positive reviews.

I don’t know. I’m a Zevon fan (not fanatic), and I thought it was interesting – 2 or 3 strong songs – but that’s it. Obviously, hold a mirror of impending death to this picture, and the effort appears stronger/heroic or whatever.

But I think Zevon wouldn’t like that: The music should stand for itself – written in the lap of luxury or facing death, no difference. Shouldn’t matter.

That said, the last cut on this, the last Zevon CD (and he knew it) is both a brilliant song and a perfect closing to a turbulent, inventive musical life.

The song cuts; even if you didn’t know the circumstances, it would be a powerful statement – in a very understated, non-sentimental way. It’s simply a song of someone saying goodbye, letting go, understanding that the end is near.

Knowing what we all know – that this is not a device, it’s the truth – it’s even stronger.

But again, the music should stand on its own.

But how can you brush aside lyrics like this?:

You know I’m tied to you like the buttons on your blouse

Keep me in your heart for awhile

– Keep Me In Your Heart

For this cut alone the CD is worth owning.

Changing Business Models

OK, we’ve heard a lot of negative press – and industry bluster – about the actions of the RIAA against file-swappers and SCO vs. well, everyone it seems.

In both cases these companies/groups just can’t grasp that their well-defined, years-old business models are, well, old.

As in outmoded.

However, instead of going with the flow and attempting to make a profit (not to mention make a good impression to the masses) by embracing and – potentially – expanding this new model (as Apple with iTunes and RedHat with Linux have done, for example), they are digging their heels in, strapping those business blinders on tightly and screaming bloody murder.

Or more correctly, having their lawyers say “I sue you … and you … and you…..”

But we all know about that, and we also know that – no matter the outcome of all this (music or open source software) – the genie is out of the bottle in both cases, and there is no going back to the old way. Sorry, the new horseless carriages are upon us.

One nice thing I’ve noticed recently is something I have not noticed: Even with the job situation in the crapper, especially for the tech industry, there has been very little hue and cry about jobs going overseas. For example, India (esp. Bangalore) seems to be getting every job that used to be in Silicon Valley. Wow.

Yet – fortunately – you don’t read much about this. News reports, yes, and many are concerned etc…but this is, again, a change in business models.

Tech can’t dig its heels in and scream bloody murder. It has to adapt. Right now, labor is so much cheaper overseas, and the global economy and wired world make it almost the same to hire a group in India as one down the road to do whatever. You never (well, rarely) meet the employees who actually produce the work, just the talking heads.

So what’s the difference – in a general sense – where they are?

There really isn’t one.

So it’s good to see this issue not blown all out of proportion.

Good for us.

And Now Warren Zevon Can Sleep

After a long illness, Warren Zevon died yesterday.

It don’t matter if I get a little tired

I’ll sleep when I’m dead

– I’ll Sleep When I’m Dead

He seemed to take it all in stride, and kept plugging away at what he loved – music – till the end.

I don’t think that makes him a genius, but it certainly embodies the soul of an artist.

Keeping with today’s theme….PHP Thoughts

I can approach this entry two ways: Be politically correct (geek-correct, in this case) and avoid the firestorm, or just say what I mean and hope folks get it in the context in which I’ve supplied it.

I’m opting for the latter, if for no other reason than….I’m not politically correct.

And it’s faster. If you’re upset by what follows…well, sorry. And OK. I’m not trying to please you.

I’m just giving an opinion.

Onward.

I read an interesting bit of info in the Netcraft newsletter for September.

In a nutshell, it said that there was surprising growth of PHP on Windows; PHP is currently projected to overtake ColdFusion as the second most popular scripting language on Windows (behind ASP) sometime next year (2004). From the looks of the graphs, CF use was increasing, but PHP was rocketing.

I found this interesting.

I posted this to the House of Fusion CF Talk list, in response to a question about whether CF use was increasing or decreasing.

My link to Netcraft and analysis (“does not bode well for CF in general”) was of course flamed. But it’s a CF list, so OK. Some folks were logical, some gave thought to responses, others responded in the best “{pick your poison} is the devil!” mode.

Whatever.

But I was thinking about all this today, and one thing struck me that I had not thought about before: Most Linux distros come with PHP; often the default is to install this and mySQL (another freebie; a database). Obviously, CF – an expensive product (~$1,000 for a single server, though a single-IP server is free with most of its products) – is not bundled on the OSS disk – but it does run faster on Linux than Windows (!).

OK.

So, to run PHP on Linux is, well, either expected or the only choice beyond Perl (or JSP if you want to go that far; let’s not. No CF, no ASP are freebies).

So one expects PHP to be on Linux.

While PHP is free on Windows (ditto Perl), you still don’t expect to see it there. You expect ASP, which is supported by IIS, free from MS with all main products. PHP and Perl are NOT part of the normal Windows install.

Interesting Part: Yes, installs of Linux are happening right and left. Same for Windows (pick a flavor). PHP is part of Linux sorta; definitely NOT part of Windows…yet PHP on Windows is climbing.

You have to make an effort to put PHP on Windows; on Linux it’s usually there.

We are lazy folks…if we go out and seek the install it means we really want it (as opposed to, “oh, yeah, we have {blah} let’s use it..”).

Yet PHP is climbing on Windows. Huh.

Sure, it’s free – that helps a lot (can you say “IE”?). But PHP usurps a lot of what both CF and ASP do, and integrates nicely with IIS (I just installed; no brainer).

Better to run PHP and mySQL on Linux/Apache than on Windows/IIS? Sure (to me).

Same price on both? Yes – free (get beyond the OS cost; we’re talking installed base).

Is there something that CF offers that PHP cannot match? Hmmm.. simplicity.

Which is more powerful? Probably PHP, with its roots in Java and Perl.

I dunno, you can keep going on like this – and be right or wrong – for hours. The bottom line is that such questions are valid (not “Can I use my Windows 3.1 box to run my E-commerce site???”); the fact that so many exist is … interesting.

I like CF. I’ve been using it heavily for years. It’s not perfect; is C? NO!

Right tool for the right job.