Notes from (the software) Underground

Well, not really underground. Just some notes in general.

VB use is slipping – according to a news.com story, the use of Visual Basic is slipping. While this is not that surprising to me (software has grown up – and passed up VB in many ways), the interesting part is what users are moving toward: 39% of those who are decreasing VB use will be turning to C#, while only 31% will turn to Java. While C# is a MS product – and right in Visual Studio – it’s still a bit surprising because Java has been around so long, and C# (MS’s answer to Java, essentially) is so new. On the other hand, who are those VB developers? Those using Visual Studio, those with MCSE certifications (or whatever it’s called…) – the Microsofties. So I guess it’s not that surprising…but it should alarm Sun a bit… Like ’em or hate ’em, MS does a good job of keeping developers and keeping them happy. When Java became a point where MS could lose developers, MS developed their own Java – C#. Keeps the defections down.

The world in two (mebbe three) servers – The May 2003 Netcraft server survey is out, and no surprises: Apache and MS are both way ahead of the pack and essentially flat over the past few months in terms of percentage gain/loss. Apache is the leader by far, but my guess is that higher-volume sites (example: Dell.com) make the number of pages served up by Apache and IIS pretty much equal. Apache runs a lot of small sites (and some really large ones, I know, I know….), and IIS is generally not used unless the company is large enough to have a fair-sized Web site (otherwise why pay all that money – whether you need it or not – to MS for licenses etc??). The interesting part to me is the plunge – pretty consistent over the past few years – of Netscape servers. Netscape servers were never high in physical site numbers, but they ran large sites (such as cars.com), so the pages-served metric worked for them like the one for IIS today. But the number is soooo small. I think Sun is dying…

Microsoft Project – According to another news.com story, MS is updating its Project tool and moving more (all?) of it onto the Web (in the form of a server, I assume, like Exchange for Outlook integration) so collaboration can be more seamless. Well, file that under DUH! That’s always been one of my main complaints against MS Project (the other – it’s too frickin’ non-intuitive): There is an owner of the project plan (hopefully…) who makes the updates and then spams everyone on the team with this huge file. And there are too many phone conferences where you can hear people saying: “Was it the one in the e-mail from Tuesday??” “No, that one had a mistake, use the Monday 2pm one…” and so on. Obviously, putting the plan on a shared drive is better, but what about folks outside the firewall??? This is a tool that cries out to be a Web app, or at least – like Exchange/Outlook – something that can be accessed off a server by a client program.

CSS flak attack – There have been a lot of electrons zipping around the Internet backbone recently carrying information – pro and con – about CSS (see previous blog; May 3, 2003). It’s been a pretty polarized argument, for the most part. There have been some pragmatic reports that I’ve read (links that are lost in some ancient history bin), but for the most part I’ve been seeing one of the two following points of view:

1) I can’t make CSS (do something). Therefore, CSS sucks. Why use it?

2) You Luddite, just stand on your head, rub your belly, assign a karma:good attribute to [whatever] and it will work unless you have SP4 installed….

Yes, both arguments are silly. But both are valid, to a degree. CSS is the future (like it or not), yet it has serious limitations today for whatever reasons (get over it). It’s good to have the debate because it needs to be had, and it will (hopefully) move things in a constructive direction once the dust settles.

CSS — Glows or Blows?

There is an interesting article and message thread going on at LiveJournal – an article by Jamie Zawinski.

Basically, the article is a rant against CSS (why it doesn’t work as well/like tables/font tags etc) and CSS supporters (“Web designers, and especially blogging web designers, are self-important fuckheads” and “dumbasses” ).

And I guess I fall into the “fuckhead dumbass” category because I like CSS – but I do realize its limitations (browser incompatibility, positioning issues, and the lack of a true table-like column structure are the big three to me) and get frustrated by it.

A lot of good points are brought up – and some good code/insights shared – but I think some salient points are missing.

I would have commented, but you had to create a profile and all that. I don’t care that much – and the world/thread is not going to be a poorer place without my pearls of wisdom…

What was missing:

  • CSS is new; font and table tags are old — If the tables (pun intended) were turned, and TABLE and FONT tags came out after CSS had been around for years…well, I don’t think they would even be used for the most part. Tables for columns, maybe. But why use the FONT tag? It would seem idiotic to assign a million different FONT tags when you can do it with one style, a few lines of CSS code. It’s only because we have been using table/font for years that we are so familiar with them. And like all humans, web developers hate change. CSS is cool in some ways, makes you tear your hair out in others, but it is something new to learn, which many don’t like.
  • CSS is new – While the standard goes back to 1996, I think, it has only really been embraced over the last year or two because the browsers never really caught up until then. So this is new stuff – and there are going to be learning curves for developers and for those who develop/support the standards.
  • If it is so bad, why is it so widely embraced (now that all browsers sans Netscrape 4.x give pretty good – but not equal – support to the standard)? While the CSS supporters do have more than a few CSS fanatics (fuckheads/dumbasses….) in their numbers, this is even more true for Mac users/supporters. Should we toss out Macs, as well?
  • The thing that kicked off Jamie’s rant was the problem with adding CSS code to make his pages more printable (so white text on lite blue could be read when someone printed them out) – Let’s put this in perspective: Even if this was a difficult problem (it isn’t – see the sketch after this list), it’s not something that is even available in the “old school” of formatting. He’s bitching that CSS sucks because it’s work to make something work that he could never get to work in the non-CSS world (unless you had a separate print version page). Isn’t it worth the effort? Isn’t it cool that it is possible – however currently imperfect the process – to do so?
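For what it’s worth, the print fix is only a few lines – a minimal sketch (the selector and colors are my assumption about his page, not his actual code):

```css
/* Swap light-on-dark colors for print so white-on-blue text
   comes out readable on paper */
@media print {
    body {
        background-color: #ffffff;
        color: #000000;
    }
}
```

Browsers that don’t understand @media just skip the block, so the worst case is what he had already: screen colors on paper.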

Second Kick…

OK, I can do the include on my page IF I set the page to a “.shtml” page.

This always annoyed me: if it’s a DOT “asp/jsp/php/cfm” whatever page, the decision has been made – to parse or not to parse (and/or include).
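For reference, the directive itself is one line – a minimal sketch assuming Apache-style SSI (the file names here are hypothetical):

```html
<!-- This page must be named with a .shtml extension (under a
     default Apache config) or the directive below is ignored -->
<!--#include virtual="/includes/blogroll.html" -->
```

And that extension requirement is exactly the gripe: the server decides whether to parse based on the file name, not on whether the page actually contains a directive.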

So, for now, I will not have the blogroll on my site (I’ll work on a static one [generated on my side dynamically….] ).

This should be so freakin’ easy…

Kick in the pants….

I’m looking at the entry below – recently posted – and I’m looking at embedded (no, not like the journalists…) text on my site (generated, as it is at this time, from blogger.com).

(No, not PRO yet…*sigh*…. I’d like the functionality and to give Ev some bucks….not in the cards right now…)

OK, I’m writing this friggin’ entry about how I’ve done what Dave/Brian have done in PHP blah blah…

And I have no blogroll on my BLOG THIS site?

Why is that?

Hmmm…..

  • I haven’t gotten around to it.
  • Who cares what I read?
  • My current site does not fully support PHP, so I can’t use existing code.

Nice excuses, I note……..

I guess I do have to write some sort of tool to make a blogroll (include file) on this site.

Hmm…I like this idea.

Because I’m not certain how to do it…but I’ll figure it out, I’m sure.

That’s the fun part.

Code on….

Blogroll

OK, this started with something Dave Winer wrote, and it got me interested.

I think I have the blogroll solution done for my own tool that I’m building (for fun/learning — not good code…yet…):

And here it is embedded in the page; the blogroll list is on the bottom right.

I differed from Dave’s design (Bryan Bell’s?) by putting bigger arrows to the right and left instead of stacking them.

I tried out Dave’s solution (which he put online for play; I don’t know if it’s still active), but I had an issue with the close proximity of the arrows. I like his/Bryan’s design better than mine, but mine is easier to hit with a mouse. I dunno. I’ll have to play for a bit and see which is better. I guess all I can say is that my initial (first impressions = important) impression of Dave/Bryan’s design is that it was difficult to hit the buttons — arrows squeezed in a small space.

Beyond that, testing tells.

PHP solution (against PostgreSQL).

Actually, I had another solution done rather quickly, but it involved too many queries.

The current solution requires only the queries actually necessary to make a change to the database — there are no queries to get “max/count([whatever])” or what have you.

Now that is all handled by array manipulation.

I still don’t think I have it optimized, but this is a step in the right direction.

It looks like I eliminated three queries, which were mostly redundant.

This is a higher level of array work than I have done in the past, but it was a good project. It’s one of the reasons I did this second version.

NOTE: PHP still does not have the best array handling functions. Don’t get me wrong – it has a million functions, but a lot are sorta silly or incomplete.

Example: array_shift() — shifts the first element off the beginning of an array and returns it.

OK, good start.

But why is there no way to delete array[3] directly? You need to unset() it and then call array_values() to re-index (reset() alone just rewinds the internal pointer; it doesn’t close the gap).

To me, if there are four elements in an array (0-3), and I want to remove the third element (2), I should be able to do this in one step and have the third slot now hold the former fourth element (3).

Doesn’t work that way now, that I can see.
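Here’s the dance in a minimal sketch (the array contents are just for illustration):

```php
<?php
$arr = array('a', 'b', 'c', 'd');  // four elements, indexes 0-3

// unset() removes the third element (index 2) but leaves a hole:
// the remaining indexes are 0, 1 and 3
unset($arr[2]);

// array_values() re-indexes from zero, so the former fourth
// element ('d') slides down into index 2
$arr = array_values($arr);

print_r($arr);  // Array ( [0] => a [1] => b [2] => d )
?>
```

Two calls for what should (to me) be one operation.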

Learning curve, but I like that….

More tools in my tool belt…

What Blogs Begat

OK, something I’ve been mulling lately.

In a recent screed, I pretty much outlined my beliefs about blogs:

  • Blogs are cool and fun to do, especially for the word-inclined
  • Most blogs aren’t worth the paper they aren’t printed on except to the people that make them, or a “small circle of friends” (important distinction: Hey, my taste in books is worthless to others. I love my books and like my taste. That sorta thing)
  • While blogs are changing things, they are not creating the earth-shattering upheavals many seem to posit for this medium

Right or wrong, that’s what I believe.

But let’s look at that last point: “…blogs are changing things”

While I don’t believe they will be the end of journalism as we know it today, just as radio or TV didn’t end the types of journalism (print => radio => TV => Web => blog) that came before each, blogs will change and extend journalism.

Each medium extended journalism – and, yes, to the slight detriment of the other conduits. That’s to be expected. But wipe out this conduit or the other? No way. Look at the most ancient conduit: word of mouth.

Still around. Don’t tell me you haven’t seen a movie, read a book, possibly even voted for a politician based primarily upon what someone else has told you.

Right?

OK, we’ve flogged that horse to death – and that’s not what I wanted to say here.

Blogs are changing things, but I think one of the most influential roles blogs – the whole blog movement – are playing is to facilitate change.

I’ve been reading Dave Winer more closely the last few months, and through him and other folks I read/have corresponded with, I’m seeing a trend of blogs speeding change and standardization – the target is blogs, but it is spilling over to other areas.

Was that obtuse enough? Some examples:

  • I don’t know – and don’t much care – about which of RSS or blogs came first, but blogs are the first place I have really seen RSS take hold. And now these feeds are popping up in places like news.com and other non-blog sites. It’s getting fairly commonplace on tech sites (understandable), but if it’s not a fad it will bleed into the regular Web news sites and then beyond. That’s good.
  • There was an article Dave pointed to today that talked about RSS-to-iPod (Apple’s) software. As other sites – such as CNN – begin to deploy RSS, suddenly your iPod carries CNN headlines. We are reaching that convergence point, and it’s because of blogs…indirectly.
  • The whole concept of journalists posting ALL notes/full transcript of interviews etc on their blogs (sanctioned/required or not by news outfit) may drive things to the point where journalists blog interviews with foreign leaders just like the gang at Boing Boing blog tech conferences. How cool would that be?
  • There seems to be a very strong open-source developer community behind blogs, and a lot of impressive folks (Dave W., Ray Ozzie etc) are putting code or concepts out there; the community responds, BANG! Better software (or, at least, a VERY short “vaporware” interval).

This – to me – is an interesting sidebar to the whole blogsplosion.

Worth noting; worth watching.

As I said sometime earlier, no one is really sure what the whole blog thing will lead to (if anything); it may stay the same or change considerably.

What I didn’t consider was its impact on those around and in it.

Me likes.

Power-grid broadband

I read an article a few days ago on ZDNET about the FCC giving at least preliminary approval to powerline-based broadband Net connections.

Interesting, but while this may make broadband more accessible (especially in outlying/remote areas), there are other potential ramifications to this type of Internet distribution that I have not seen addressed:

  • Who gets to do the actual providing of access? Sure, the powerline becomes the pipe, but there has to be an ISP (even if it turns out to be the power company) somewhere to peer into the network backbone, give out e-mail addresses etc.
  • Speaking of peering, how will this work? Will the power lines become part of the Internet, or be walled off so it’s just like a giant T1 line that goes only to that company’s lines?
  • Power surges?

Actually, at first I was worried about home networking — if every plug is an Internet connection, why would Linksys be needed except for firewall? — but it still requires a modem (the size of a deck of cards, the article says).

Which brings up another point: At what point will ALL modem stuff become standardized, the way Wi-Fi has (such as on PCMCIA cards)? So it’s built into the computer, and there aren’t even small (deck-of-cards) modems at all?

Or built into the router? Here, I have a cable coming into a modem about the size of a paperback book (a bigger one). That goes – now all via Ethernet cables – to my router, which only then goes to all machines (so all machines have the firewall, DHCP etc.).

While I understand why the modem is needed, it basically is just a huge box (that requires a power plug) providing nothing but a cable-to-Ethernet adaptor.

CSS Testiness

OK, I’m a huge fan of CSS. To paraphrase the Amex commercial, “I don’t build a Web site without it.”

Yet I have complained (from time to time) about shortcomings in CSS, real or perceived.

One complaint I’ve been voicing (under my breath, to myself) recently is the lack of variable scope in CSS.

There aren’t any variables (much less additional scope) in a CSS file, unless you parse out a NON-“.css” file (myCSS.php) and make replacements that way.

This works, but then it is not as flexible.

Yes yes yeah yeah….CSS is spozed to be this static include file blah blah.

But just as the Web — with it’s static HTML pages — evolved to a very dynamic code-generation system (“No database? That is so 1996..”), one would think that this logic would be available for CSS. Doesn’t appear that it is. CSS is killing me.

Personalization and ease of maintenance call for CSS to be more dynamic.

Is this possible – in a regular CSS file???

I don’t know of it, but I’d like it.

Take a simple example, for one area of a site without personalization: The same colors echo – either as “color” or “background-color” – throughout the style sheet. If I want to change the background color from black (with white text) to white (with black text) I have to make all sorts of switches, including those for hovers, links and so on.

Isn’t there a programmatic way — within the style sheet — to make these variables ($dark = "#000000"; $lite = "#ffffff")….well, variables?

Yes, I can do it through scripting, or “search and replace” …but…why?

Why can’t I change three variables in a “.css” file and have that cascade?

Interesting thought. I’ve done a bit of work with this, with personalization (either from a database or cookie), applying these user preferences to a NON-css file (a file that can be parsed so the variables I pass to it are captured) that is then written out as plain CSS.

I’ve also done it so each user – upon selection of either a preset scheme or personalized choices – gets a “unique” style sheet that is then written out and included each time (this seems more efficient; also more clumsy – if the values are in a cookie or db, why not just include them into an included file? Want 12,000 “[user_id].css” files clogging up your file system??)
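Here’s a minimal sketch of the first approach – a PHP file posing as a stylesheet (the file, variable and selector names are mine, just for illustration):

```php
<?php
// myCSS.php -- send a CSS content type so the browser treats
// the parsed output as an ordinary stylesheet
header('Content-type: text/css');

// the "variables" that CSS itself won't give me
$dark = '#000000';
$lite = '#ffffff';
?>
body    { background-color: <?php echo $dark; ?>; color: <?php echo $lite; ?>; }
a:link  { color: <?php echo $lite; ?>; }
a:hover { background-color: <?php echo $lite; ?>; color: <?php echo $dark; ?>; }
```

The page pulls it in with <link rel="stylesheet" type="text/css" href="myCSS.php" />, and flipping the whole color scheme really is a two-variable change.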

I have to get deeper into this…

04/16/2003 update: Yes, I know you can use JavaScript to make the changes (I do this on the “Text size” link in the menu), but — again — this is programmatic, not variable-based. Still have to, for example, change every instance of “#000000” to “#ffffff” or whatever. Not just saying “$darkColor = #ffffff” and having it apply globally.

Googleperplexed

Or – let the Google-bashing begin!

As noted in some earlier entry, Google is in the crosshairs of the technorati now, simply by virtue of its success.

Yes: In America, kicking someone when they are down is bad manners, but kicking someone when they are up, hey, that’s just American!

The latest GoogleFlak is the analysis of Google’s SafeSearch option by Harvard’s Ben Edelman.

Edelman’s conclusion: SafeSearch blocks thousands of innocuous sites (example: “Hardcore Visual Basic Programming”).

My reactions:

  1. GASP! I’m shocked! Stunned! Amazed!
  2. Nice catch – this might push Google to be better with this
  3. WHO CARES/SO WHAT!?

In order:

GASP! I’m shocked! Stunned! Amazed!

Why should Google be any different than the other filters out there? There are companies whose entire purpose is to correctly filter out the naughty sites, and they unfailingly block sites that are useful (CDC, breast-cancer sites, and so on). Especially in an automated fashion, it’s tough.

That’s one of the reasons that the ACLU and librarians don’t want to have to install filters on library computers: So much good stuff will be blocked out, as well.

While Google is certainly positioned to do a better job than the net filters, I never really imagined that Google would do that much better (see No. 3 for more on this), at least at first.

Nice catch – this might push Google to be better with this

It’s always good to have watchdogs out there – for causes I believe in, those I don’t, those I don’t care about. This group examination is a good “checks and balances” type system. Good for Ben.

It’s doubtful that any harm will come from this analysis/publicity. Yes, Google may have to work a little harder, but that will earn them some respect etc. We can all win. Good for Ben, again.

WHO CARES/SO WHAT!?

I read so much hand-wringing reaction over this “discovery,” mainly on blogs but in tech columns, as well; I just don’t understand the fuss.

Let’s look at a few facts and observations:

  • Google never promised that the SafeSearch filter was 100 percent accurate.
  • What is 100 percent accurate? I think access to information about contraceptive choices should be allowed through; you may think this is unsuitable for your child.
  • Google’s response to this study is that they try to err on the side of caution: Whether or not this is true, it seems to be a good policy. Kind of like the “innocent until proven guilty” concept. If in doubt, suppress. AND NOTE that this suppression is not censorship. The user turned on the filter, and can always turn it off and resubmit.
  • You don’t have to use the filter – Unlike the debate over library filters, Google can be used in two ways: Filtered and unfiltered. Feel like you’re missing things? Turn filter off (this is the default). Getting too many naughty links for your taste? Turn the filter on. Your choice.
  • Google is not in the business of this type of filtering – the accuracy of their filter is probably not as high a priority as other projects/tools. Let’s be realistic. (Note: I’m fully aware that Google is, basically, a harvesting and filtering company, so filtering [indexing, page rank etc] is key to its operation. But not in the “naughty or nice” way — at least not currently)

I don’t know, while it’s nice that the study was done and hopefully shared with Google, I just don’t see what all the fuss is about.

It’s as though people expected Google to somehow do a perfect job of this peripheral project. Why?

And has anyone examined, say, Yahoo’s protected search to see how much better/worse it does? I read nothing about this concept in any of the articles/blogs I read.

Hey, the Google porn filter could be 100 times better than Yahoo’s (or Teoma’s etc…); it could be 100 times worse.

Let’s see some comparisons, and then we’ll have something to talk about.

========

Note: I wrote to Dave Winer about this; he forwarded my message to Ben. Both sent back nice, sometimes-not-agreeing messages to my thoughts. Excellent. I like the give-and-take; it clears the mental cobwebs.

I guess where we still have to agree to disagree is that, while Google has a bunch of really smart techies, filtering is not – to my mind – high on their priority list. Dave and Ben still hold to the “surprised Google didn’t do better” stance; I’m not surprised. It’s not on their radar (it should be; profit center…..).

Ben’s note was the most detailed; reproduced below:

Lee,

Your thinking as to the context of this work, its value, and its reception in the community and media is generally consistent with my own.

I do disagree somewhat with your suggestion that there was no reason to think Google might do a far better job in this field than anyone else. They have quite a talented bunch of engineers, with impressive results in major fields (i.e. core search quality!). They also have a huge database of content in their cache. And I, at least, have found it difficult to get a good sense of just what AI systems can do and what they can’t — on one hand, they’re clearly still imperfect, but on the other hand I’m sometimes shocked by just how good they can be. All that’s to say — I started this project with the sense that SafeSearch might well get a clean bill of health.

My real focus here, though, did become the “push Google to be better with this,” as you propose in your #2. The service has been in place for three years without, I gather, any large-scale investigation of its accuracy or effectiveness. (And I say that with full readiness to admit that there’s lots more I, or others, could do; I’m not sure I’d call what I’ve done so far a “thorough” investigation, given the millions of search terms and billions of result pages not checked.) I’m hopeful that my work will cause Google to reevaluate some of their decisions and, perhaps most importantly, improve their transparency and documentation as to how the system works.

As to the “who cares” reaction — there’s always the potential, in blogspace as well as in commercial news sites, for a story to get overblown. I’m not immediately prepared to say whether that’s what’s happening here. Personally, I think coverage like that on http://dognews.blogspot.com/ (see the 3:31PM post of yesterday; their deep/permanent links unfortunately aren’t working quite right at present) isn’t such a bad thing and doesn’t make the world a worse place!

Anyway, thanks for the clear thinking here and the explicit taxonomy of the several approaches to this project. That’s a nice and, I think, helpful way to present the varying perspectives here.

Ben Edelman
Berkman Center for Internet & Society
Harvard Law School
http://cyber.law.harvard.edu/edelman

Not a Troll

Hey, my entry below dissing mySQL was not meant as a troll.

I said:

When I started this project, I sort of gave in and had it running against mySQL — while I hate it, it is the dominant open-source database (for better or for worse…). MovableType runs against this, and MT is used all over the Blogosphere, so whatever…

As I began coding and wanted to do stuff, however, I quickly ran out of obscenities to use for this sad excuse of a database.

Sorry, mySQL doesn’t do it for me.

To me, mySQL = Microsoft Access. Both do a lot and a lot well. For 90% of the uses out there, this is all that is needed just about all of the time.

And both databases are incredibly simple to set up (hell, I’m still screwing with an Oracle install on one of my Linux boxes. What a pain in the keester!). Postgres is a little awkward to set up (“Is the postmaster running on port 5432” or whatever the errors are) — you have to create users, run initdb and all that.

For me, mySQL was a no-brainer (perfect for me!) to install: Installed (from RPM) and it was there. Bang. Simple.

Access, of course, is just “there.”

So both Access and mySQL have merits, but not for running a high-volume, highly transactional Web site.

Yes, my opinion. But look at how often Slashdot – Perl/Mason against mySQL – goes down. Daily. I can’t imagine it being the code (though the code is pretty convoluted – download the TAR and look at it. Messy!).

Ditto for the new Internet meme I wrote about recently: Blogshares.com – PHP against a mySQL database. While the traffic volume may well have played a role in its at-least-initial instability/slowness, I think the database was a bad choice. First of all, I think mySQL is just not hardy enough for it, and this is a site that screams out for stored procedures – which are not supported by mySQL (and still will not be in the next [4.1] release).
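To make the stored-procedure point concrete, here’s the flavor of thing Postgres offers – a hypothetical PL/pgSQL sketch (the table, column and function names are all invented; this is not Blogshares’ actual schema):

```sql
-- One atomic, server-side unit of work for a share purchase,
-- instead of several round-trip queries from the PHP layer.
-- (Assumes plpgsql has been installed with createlang.)
CREATE FUNCTION buy_shares(integer, integer, integer) RETURNS integer AS '
DECLARE
    p_user  ALIAS FOR $1;
    p_blog  ALIAS FOR $2;
    p_qty   ALIAS FOR $3;
    v_price numeric;
BEGIN
    SELECT price INTO v_price FROM blogs WHERE blog_id = p_blog;

    UPDATE portfolios SET cash = cash - (p_qty * v_price)
        WHERE user_id = p_user;

    UPDATE holdings SET shares = shares + p_qty
        WHERE user_id = p_user AND blog_id = p_blog;

    RETURN p_qty;
END;
' LANGUAGE 'plpgsql';
```

All the reads and writes happen inside the database, so the application makes one call instead of juggling three or four queries (and hoping nothing changes in between).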

mySQL seems to do well with simple selects, but even this “talent” is being usurped by Postgres, at least according to a fairly impartial test of the two by Tim Perdue of phpbuilder.net.

And while my blogging tool will probably never move off my home (behind firewall blah blah) box, and will probably never be used for actual production of my or anyone else’s blog, it is still designed to be used by multiple users with high volume.

At least that’s my goal — so why not set the bar high?

RE: blogshares – While I do think that mySQL is not a good choice for this site (lots of reads and writes; data-integrity issues; transaction issues [mySQL does not support transactions]), I fully acknowledge that it’s tough to find a host that will run Postgres for you. (To be honest, I don’t know if it’s even possible…)

The Linux hosts all come with Perl, virtually all offer PHP (at least in the upgrade package).

Databases are a different story. Usually there is mySQL and sometimes mSQL — and the database option is often an upgrade. This is changing slowly, but still, it’s rough to get good/straightforward database hosting on Linux. (This is also true on the Windows side, with different databases: Access/MS SQL Server/FoxPro.)

So the user is pretty much stuck with mySQL (mucho better than mSQL – at least from what I read…).

So I understand the choice. I just think it’s a bad one that is going to be problematic, and — guess what? — it already is.

That said, I see the reasons that MovableType.org went with mySQL:

  • Like it or not, that’s the database you can get from a Web host. ‘Nuff said.
  • While I have serious “issues” with mySQL, it does do well in “selects only” areas. And what is a blog? ONE person makes updates (inserts), the rest is reads except for a possible comments section. Like Access, mySQL is well-suited for this.

That still does not explain why Slashdot has not converted over to either Postgres or Oracle: This is a highly-transactional site.

In addition to users clicking around to stories and comments, there are users adding comments, users meta-moderating, users being added/edited and so on.

There’s a lot of shit going on.

And – about once a day, it seems – that “lot of” hits the fan…